![Pandas.png](attachment:Pandas.png)

<b>Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive.</b>

# Major functionality of Pandas 


<ol>
<li>Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data</li>

<li>Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects</li>

<li>Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations</li>
<li>Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data</li>
<li>Easy conversion of ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects</li>

</ol>
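
A minimal sketch of three of the features listed above (missing data, automatic alignment, and group by). The variable names and the small values are made up for illustration only:

```python
import numpy as np
import pandas as pd

# Missing data: NaN is skipped by default in reductions such as mean().
s = pd.Series([1.0, np.nan, 3.0])
mean_without_nan = s.mean()   # (1.0 + 3.0) / 2
filled = s.fillna(0.0)        # replace NaN with 0.0

# Automatic alignment: values are matched by label, not by position.
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["z", "y", "x"])
aligned_sum = a + b           # x: 1 + 30, y: 2 + 20, z: 3 + 10

# Group by: split rows by key, apply sum to each group, combine the results.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
totals = df.groupby("key")["value"].sum()

print(mean_without_nan)
print(aligned_sum)
print(totals)
```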

# Pandas Data structures 

The two primary data structures of pandas, <b>Series (1-dimensional) and DataFrame (2-dimensional)</b>, handle the vast majority of typical use cases in <b>finance, statistics, social science, and many areas of engineering</b>.

# DataFrame


A DataFrame is a table. It contains an array of individual entries, each of which has a certain value. Each entry corresponds to a row (or record) and a column.

For example, consider the following simple DataFrame:

In [12]:
import pandas as pd  # import the pandas library
pd.DataFrame({'Confirmed COVID Case': [50, 21], 'Recovered COVID Case': [131, 2]})  # create a DataFrame

Unnamed: 0,Confirmed COVID Case,Recovered COVID Case
0,50,131
1,21,2
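
Since each entry corresponds to a row and a column, one way to look at individual entries is shown below. This is only a sketch; the variable name `cases` is an assumption introduced here:

```python
import pandas as pd

cases = pd.DataFrame({
    "Confirmed COVID Case": [50, 21],
    "Recovered COVID Case": [131, 2],
})

print(cases.shape)                              # (number of rows, number of columns)
print(cases.loc[0, "Confirmed COVID Case"])    # single entry at row label 0
print(cases["Recovered COVID Case"].tolist())  # all values of one column
```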


<b>DataFrame entries are not limited to integers. For instance, here's a DataFrame whose values are strings:</b>

In [18]:
pd.DataFrame({'Most Affected COVID-19 States': ['Maharashtra', 'Delhi'], 'Least Affected State': ['Meghalaya', 'Sikkim']}, index=['State', 'State '])  # create a DataFrame with custom row labels

Unnamed: 0,Least Affected State,Most Affected COVID-19 States
State,Meghalaya,Maharashtra
State,Sikkim,Delhi
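
Once a DataFrame has custom row labels, rows can be picked by label or by position. A small sketch, assuming hypothetical labels `Rank 1` and `Rank 2` (not the labels used in the cell above):

```python
import pandas as pd

states = pd.DataFrame(
    {
        "Most Affected COVID-19 States": ["Maharashtra", "Delhi"],
        "Least Affected State": ["Meghalaya", "Sikkim"],
    },
    index=["Rank 1", "Rank 2"],  # hypothetical row labels
)

# .loc selects by label; .iloc selects by integer position.
first_by_label = states.loc["Rank 1", "Most Affected COVID-19 States"]
second_row = states.iloc[1]

print(first_by_label)
print(second_row.tolist())
```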


# Series

A Series, by contrast, is a sequence of data values. If a DataFrame is a table, a Series is a list. And in fact you can create one with nothing more than a list:



In [19]:
pd.Series([1, 2, 3, 4, 5]) #Creation of Series 

0    1
1    2
2    3
3    4
4    5
dtype: int64
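
A Series built from a list gets default integer labels, and arithmetic on it is applied element by element. A short sketch (the name `numbers` is chosen here for illustration):

```python
import pandas as pd

numbers = pd.Series([1, 2, 3, 4, 5])

print(list(numbers.index))   # default labels, one per value
print(numbers.dtype)         # integer dtype inferred from the list

doubled = numbers * 2        # arithmetic applies to every element
print(doubled.tolist())
```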

<b>A Series is, in essence, a single column of a DataFrame. So you can assign row labels to the Series the same way as before, using an index parameter. However, a Series does not have a column name; it only has one overall name:</b>

In [28]:
pd.Series([64139, 22033, 14704], index=['Mumbai', 'Thane', 'Pune'], name='Number of Patients City wise')
# As of 20 June 2020

Mumbai    64139
Thane     22033
Pune      14704
Name: Number of Patients City wise, dtype: int64

<b>Takeaway</b>: <b><i>The Series and the DataFrame are intimately related. It's helpful to think of a DataFrame as actually being just a bunch of Series "glued together".</i></b>
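
The "glued together" idea can be sketched with `pd.concat`: two named Series become the two columns of a DataFrame, and selecting a column returns a Series again. The Series names below are assumptions made for this example:

```python
import pandas as pd

confirmed = pd.Series([50, 21], name="Confirmed")
recovered = pd.Series([131, 2], name="Recovered")

# Glue two Series side by side; each Series name becomes a column name.
table = pd.concat([confirmed, recovered], axis=1)

# Selecting one column gives a Series back.
column = table["Confirmed"]

print(type(column).__name__)
print(table)
```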

# Reading data files

<ul>
<li><b>Being able to create a DataFrame or Series by hand is handy. But, most of the time, we won't actually be creating our own data by hand. Instead, we'll be working with data that already exists.</b></li>

<li><b>Data can be stored in any of a number of different forms and formats. By far the most basic of these is the humble CSV file. When you open a CSV file, you see plain text: one record per line, with the values in each record separated by commas.</b></li>
</ul>
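
To make the format concrete, the sketch below reads a small piece of CSV text from memory instead of from a file. The text is illustrative: it is transcribed from the first rows displayed later in this notebook, and the real file may differ in detail.

```python
import io

import pandas as pd

# Illustrative CSV text: a header line, then one record per line.
csv_text = """Date,State,TotalSamples,Negative,Positive
2020-04-17,Andaman and Nicobar Islands,1403.0,1210.0,12.0
2020-04-24,Andaman and Nicobar Islands,2679.0,,27.0
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))
print(sample)
```

An empty field between two commas, as in the second record, is read as a missing value (NaN).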



# Reading data files from the current directory

In [37]:
df = pd.read_csv('StatewiseTestingDetails.csv')  # create a DataFrame from StatewiseTestingDetails.csv

In [38]:
df.head()  # display the first five rows

Unnamed: 0,Date,State,TotalSamples,Negative,Positive
0,2020-04-17,Andaman and Nicobar Islands,1403.0,1210.0,12.0
1,2020-04-24,Andaman and Nicobar Islands,2679.0,,27.0
2,2020-04-27,Andaman and Nicobar Islands,2848.0,,33.0
3,2020-05-01,Andaman and Nicobar Islands,3754.0,,33.0
4,2020-05-16,Andaman and Nicobar Islands,6677.0,,33.0
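
The output above shows missing values in the `Negative` column. One possible way to count them, and to have the dates parsed on load, is sketched below. To keep the example self-contained it reads the five displayed rows from an in-memory string rather than from the file, and the name `sample_df` is an assumption:

```python
import io

import pandas as pd

# The five rows displayed above, written out as CSV text.
csv_text = """Date,State,TotalSamples,Negative,Positive
2020-04-17,Andaman and Nicobar Islands,1403.0,1210.0,12.0
2020-04-24,Andaman and Nicobar Islands,2679.0,,27.0
2020-04-27,Andaman and Nicobar Islands,2848.0,,33.0
2020-05-01,Andaman and Nicobar Islands,3754.0,,33.0
2020-05-16,Andaman and Nicobar Islands,6677.0,,33.0
"""

# parse_dates converts the Date column from text to datetime values.
sample_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

missing_per_column = sample_df.isna().sum()  # number of NaN values in each column

print(sample_df.dtypes)
print(missing_per_column)
```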


In [39]:
df2 = pd.read_excel(r'StatewiseTestingDetails.xlsx')  # create a DataFrame from StatewiseTestingDetails.xlsx

In [40]:
df2.head()  # display the first five rows

Unnamed: 0,Date,State,TotalSamples,Negative,Positive
0,2020-04-17,Andaman and Nicobar Islands,1403,1210.0,12.0
1,2020-04-24,Andaman and Nicobar Islands,2679,,27.0
2,2020-04-27,Andaman and Nicobar Islands,2848,,33.0
3,2020-05-01,Andaman and Nicobar Islands,3754,,33.0
4,2020-05-16,Andaman and Nicobar Islands,6677,,33.0


# Reading data files from a path

In [41]:
path = r"E:\LPU\Session 2019-20 II\Logistics 19-20 II\Corona Paper\Temp\STC\Miscellenous\StatewiseTestingDetails.xlsx"  # raw string keeps the backslashes literal

In [42]:
df3 = pd.read_excel(path)  # create a DataFrame from StatewiseTestingDetails.xlsx

In [45]:
df3.head()

Unnamed: 0,Date,State,TotalSamples,Negative,Positive
0,2020-04-17,Andaman and Nicobar Islands,1403,1210.0,12.0
1,2020-04-24,Andaman and Nicobar Islands,2679,,27.0
2,2020-04-27,Andaman and Nicobar Islands,2848,,33.0
3,2020-05-01,Andaman and Nicobar Islands,3754,,33.0
4,2020-05-16,Andaman and Nicobar Islands,6677,,33.0
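
Backslashes in an ordinary Python string can be read as escape sequences, which is why Windows paths are usually written as raw strings (prefix `r`). A minimal sketch, using a short hypothetical path rather than the one from this notebook:

```python
from pathlib import PureWindowsPath

plain = "C:\temp\new.csv"    # \t and \n are turned into a tab and a newline
raw = r"C:\temp\new.csv"     # raw string: every backslash stays literal

print(repr(plain))
print(repr(raw))

# pathlib can split a Windows-style path into its parts on any platform.
parsed = PureWindowsPath(raw)
print(parsed.name)
print(parsed.suffix)
```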


# Saving Dataset

In [48]:
df3.to_csv(r"C:\Users\login\OneDrive\Documents\PyScript\Testing.csv")
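
By default, `to_csv` also writes the row index as the first column, and that column comes back as `Unnamed: 0` when the file is read again. Passing `index=False` leaves it out. The sketch below shows both cases; it writes to a temporary folder, and the small frame reuses two values displayed earlier rather than the full dataset:

```python
import tempfile
from pathlib import Path

import pandas as pd

frame = pd.DataFrame({"Date": ["2020-04-17", "2020-04-24"], "Positive": [12.0, 27.0]})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "Testing.csv"

    frame.to_csv(out_path)                 # default: the index is written too
    reloaded_default = pd.read_csv(out_path)

    frame.to_csv(out_path, index=False)    # leave the index out of the file
    reloaded_clean = pd.read_csv(out_path)

print(list(reloaded_default.columns))
print(list(reloaded_clean.columns))
```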