# Checking Pandas Version
The version string is stored in the __version__ attribute.

In [20]:
import pandas as pd

print(pd.__version__)

2.3.3
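When a notebook depends on a minimum pandas version, the version string can be split into numbers and compared. The following is a minimal sketch; the 1.0 threshold and the names major, minor and is_supported are made up for illustration.

```python
import pandas as pd

# Take the first two dot-separated components, e.g. "2.3.3" -> (2, 3)
major, minor = (int(part) for part in pd.__version__.split(".")[:2])

# Hypothetical requirement, for illustration only: pandas 1.0 or newer
is_supported = (major, minor) >= (1, 0)
print(major, minor, is_supported)
```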


# Pandas Series
A Pandas Series is like a column in a table.
It is a one-dimensional array holding data of any type.

In [21]:
import pandas as pd

a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)

0    1
1    7
2    2
dtype: int64


The describe method returns summary statistics for our data. We used to call
several separate NumPy functions to obtain this information, but with pandas
describe alone is sufficient.

When we need a different set of statistics, the agg method lets us compute
exactly the ones we ask for.

In [22]:
data=pd.Series([1,2,3,4,5,6,7,8,9,10])
print(data.describe())

count    10.00000
mean      5.50000
std       3.02765
min       1.00000
25%       3.25000
50%       5.50000
75%       7.75000
max      10.00000
dtype: float64


In [23]:
print(data.agg(['max','min','sum','mean','std','var','count']))

max      10.000000
min       1.000000
sum      55.000000
mean      5.500000
std       3.027650
var       9.166667
count    10.000000
dtype: float64
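One detail to keep in mind when comparing these results with NumPy: pandas computes the sample standard deviation (ddof=1) by default, while np.std defaults to the population version (ddof=0). A small sketch of the difference, reusing the same ten values (the variable names are ours):

```python
import numpy as np
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
values = data.to_numpy()

# pandas default: sample standard deviation (divides by n - 1)
pandas_std = data.std()

# NumPy default: population standard deviation (divides by n)
numpy_default_std = np.std(values)

# Passing ddof=1 makes NumPy agree with pandas
numpy_sample_std = np.std(values, ddof=1)

print(pandas_std, numpy_default_std, numpy_sample_std)
```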


# Accessing data

## Labels
If nothing else is specified, the values are labeled with their index number. The first value has index 0, the second value has index 1, and so on.

This label can be used to access a specified value.

In [24]:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
# return the first value of series
print(myvar[0])

0    1
1    7
2    2
dtype: int64
1


With the index argument, you can name your own labels.

In [25]:
import pandas as pd

a=[1,2,3,4,5]
Series=pd.Series(a,index=['a','b','c','d','e'])
print(Series)

a    1
b    2
c    3
d    4
e    5
dtype: int64


In [26]:
print(Series['a'])

1
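Once a Series has custom labels, it helps to say explicitly whether we mean a label or a position. The sketch below uses the .loc (label-based) and .iloc (position-based) accessors on the same data; the variable names are chosen for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# Label-based access
by_label = s.loc["a"]

# Position-based access (0 is the first element)
by_position = s.iloc[0]

# Label slices include both endpoints
label_slice = s.loc["b":"d"]

print(by_label, by_position)
print(label_slice)
```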


We can also create a Series from a dictionary of key:value pairs. The keys become the labels.

In [27]:
import pandas as pd

A={"1st month":"jan",
   "2nd month":"feb",
   "3rd month":"march",
   "4th month":"apr"}
myvar=pd.Series(A)
print(myvar)

1st month      jan
2nd month      feb
3rd month    march
4th month      apr
dtype: object


We can choose which items of the dictionary go into the Series by passing only their keys to the index argument.


In [28]:
myvar2=pd.Series(A,index=["1st month","2nd month"])
print(myvar2)

1st month    jan
2nd month    feb
dtype: object
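If the index list contains a key that is not in the dictionary, pandas does not raise an error; it fills that position with NaN. A short sketch, where "5th month" is a deliberately missing key added for illustration:

```python
import pandas as pd

A = {"1st month": "jan",
     "2nd month": "feb",
     "3rd month": "march",
     "4th month": "apr"}

# "5th month" is not a key of A, so its value becomes NaN
partial = pd.Series(A, index=["1st month", "5th month"])
print(partial)
```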


#### Slicing by index

In [29]:
import pandas as pd

A_series=pd.Series([1.23,2.34,3.45,4.56,5.67])
print("using slicing to access specific data\n", A_series[2:])
print("using steps to access specific data\n", A_series[0::2])

using slicing to access specific data
 2    3.45
3    4.56
4    5.67
dtype: float64
using steps to access specific data
 0    1.23
2    3.45
4    5.67
dtype: float64
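Integer slices like the ones above select by position. The same idea can be written with .iloc, which also accepts negative positions and negative steps. A brief sketch on the same values (variable names are ours):

```python
import pandas as pd

A_series = pd.Series([1.23, 2.34, 3.45, 4.56, 5.67])

# Last two elements
last_two = A_series.iloc[-2:]

# Every element in reverse order
reversed_series = A_series.iloc[::-1]

print(last_two)
print(reversed_series)
```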


# DataFrames (Tables)

Data sets in Pandas are usually multi-dimensional tables, called DataFrames.
A Series is like a column; a DataFrame is the whole table.

### Creating a DataFrame

There are various ways to create DataFrames in pandas, including from arrays,
lists, and Series.

We will explore each method in this notebook. You will notice that the
DataFrame constructor accepts several optional parameters, such as index and
columns.

The one argument we will always pass here is the data itself (an array, a list,
or a dictionary of Series).

#### Creating a DataFrame from an array

The length of the index list must match the number of rows in the data, and
the length of the columns list must match the number of columns in the data.

In [39]:
import pandas as pd
import numpy as np

#actual data
Data=np.array([[1,2,3,4],
               [5,6,7,8],
               [9,10,11,12],
               [13,14,15,16]])

#Naming rows
Row=np.array(["Row1","Row2","Row3","Row4"])
#Naming columns
Column=np.array(["Col1","Col2","Col3","Col4"])
#Creating dataframe
DataFrame=pd.DataFrame(Data, index=Row, columns=Column)

print(DataFrame)

      Col1  Col2  Col3  Col4
Row1     1     2     3     4
Row2     5     6     7     8
Row3     9    10    11    12
Row4    13    14    15    16
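To see what happens when the lengths do not match, the sketch below passes only three row names for a 4x4 array and catches the resulting ValueError. The names mismatch_raised and message are ours, added for illustration.

```python
import numpy as np
import pandas as pd

data = np.arange(1, 17).reshape(4, 4)  # same 4x4 values as above

try:
    # Only three row labels for four rows of data
    pd.DataFrame(data, index=["Row1", "Row2", "Row3"])
    mismatch_raised = False
    message = ""
except ValueError as err:
    mismatch_raised = True
    message = str(err)

print(mismatch_raised)
print(message)
```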


#### Creating a DataFrame from a list

In [42]:
import pandas as pd

Data=[["Sumit pandey",20,"Hotel Manager"],
      ["Shubham Kanojiya",24,"Clinic owner"],
      ["Akshy Maurya",25,"Production Manager"],
      ["Ayush Kharawar",30,"Police officer"],
      ["Sarvesh Maurya",29,"ML Engineer"]]

#define column names
Column_name=["Name","age","Profession"]
#creating data frame
DataFrame=pd.DataFrame(Data, columns=Column_name)

print(DataFrame)

               Name  age          Profession
0      Sumit pandey   20       Hotel Manager
1  Shubham Kanojiya   24        Clinic owner
2      Akshy Maurya   25  Production Manager
3    Ayush Kharawar   30      Police officer
4    Sarvesh Maurya   29         ML Engineer
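Since a DataFrame is a table made of columns, selecting one column gives back a Series, and everything shown earlier for Series applies to it. A minimal sketch with a small made-up table (the names and ages below are placeholders, not the data above):

```python
import pandas as pd

# Hypothetical two-row table, for illustration only
people = pd.DataFrame([["Ann", 20], ["Bob", 24]], columns=["Name", "age"])

# Selecting a single column returns a Series
ages = people["age"]
is_series = isinstance(ages, pd.Series)

# Series methods such as mean work on the column
average_age = ages.mean()

print(is_series, average_age)
```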


#### Creating a DataFrame from Series

As previously mentioned, a Series is essentially a one-dimensional array. If you
have multiple Series, you can combine them to create a DataFrame: each Series
becomes a column, and the Series labels become the row labels.

In [43]:
import pandas as pd

w=pd.Series({'A':1,'B':2,'C':3,'D':4})
x=pd.Series({'A':5,'B':6,'C':7,'D':8})
y=pd.Series({'A':9,'B':10,'C':11,'D':12})
z=pd.Series({'A':13,'B':14,'C':15,'D':16})

df=pd.DataFrame({'a':w,'b':x,'c':y,'d':z})
print(df)

   a  b   c   d
A  1  5   9  13
B  2  6  10  14
C  3  7  11  15
D  4  8  12  16
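The Series above all share the same labels. When they do not, pandas aligns them on the union of the labels and fills the gaps with NaN. A small sketch with two made-up Series:

```python
import pandas as pd

# Two Series that only share the label 'B' (illustrative values)
left = pd.Series({"A": 1, "B": 2})
right = pd.Series({"B": 3, "C": 4})

aligned = pd.DataFrame({"a": left, "b": right})

# Rows are the union of the labels; missing entries are NaN
print(aligned)
```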
