# Pandas
- Pandas is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.
- It is a powerful library that works with data frames holding 'relational' or 'labeled' data.
- It aims to support real-world data analysis in Python.

### Types of Data Structures:
1. Series: One-dimensional labeled array
2. DataFrame: Two-dimensional labeled data structure (like a table)

### To install pandas
- !pip install pandas (the leading `!` runs the command from inside a notebook cell)

In [1]:
import numpy as np
import pandas as pd

# Series
- A Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python object, etc.).
- The axis labels are collectively called index.
- A pandas Series can be created using the constructor:
    - pandas.Series(data, index, dtype)
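The three constructor arguments listed above can be sketched as follows. This is a minimal illustration; the variable name `prices` and the labels are made up for the example.

```python
import pandas as pd

# data: the values; index: one label per value; dtype: the element type
prices = pd.Series(data=[10, 20, 30], index=["a", "b", "c"], dtype="float64")

print(prices.dtype)        # the dtype requested above
print(list(prices.index))  # the labels passed as index
print(prices["b"])         # value stored under label "b"
```

Without `dtype`, pandas infers the type from the data, which is why the integer examples in this notebook show `int64`.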

### Creating Series
1. From a List

In [3]:
s=pd.Series()
print(s)

Series([], dtype: object)


In [4]:
s_list=pd.Series(data=[1,2,3,4,5,6,7])
s_list

0    1
1    2
2    3
3    4
4    5
5    6
6    7
dtype: int64

In [11]:
s_new_one=pd.Series(data=[1,2,3,4,5,6,7],index=[100,34,56,456,89,234,103])
s_new_one

100    1
34     2
56     3
456    4
89     5
234    6
103    7
dtype: int64

In [6]:
# If you pass an index, its length must be the same as the length of the data
s_list_index=pd.Series([10,20,30,40],index=[101,102,103,104])
s_list_index

101    10
102    20
103    30
104    40
dtype: int64

In [3]:
new_list=pd.Series([2,3,4,5,6,7,8,9],index=[1,2,3,4,5,6,7,8])
new_list

1    2
2    3
3    4
4    5
5    6
6    7
7    8
8    9
dtype: int64

2. From a Dictionary

In [8]:
s_dict=pd.Series({
    "a":12,
    "b":15,
    "c":5,
    "d":30
},index=['d','b','c','a'])
s_dict

d    30
b    15
c     5
a    12
dtype: int64
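When a dictionary is combined with an explicit `index`, the index decides which keys are kept and in what order. The sketch below uses a made-up label `"z"` that is not in the dictionary to show what pandas does with a missing key.

```python
import pandas as pd

scores = {"a": 12, "b": 15, "c": 5, "d": 30}

# Only the requested labels are kept, in the requested order.
# "z" is not a key of the dictionary, so its value becomes NaN.
subset = pd.Series(scores, index=["d", "a", "z"])

print(subset)
```

Because NaN is a floating-point value, the resulting Series is stored as `float64` rather than `int64`.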

3. From NumPy Array

In [15]:
data = np.array(['a','b','c','d','e'])
s_array =pd.Series(data,index=[101,102,103,104,105])
s_array

101    a
102    b
103    c
104    d
105    e
dtype: object

In [11]:
# From a scalar: the single value is repeated for every label in the index
pd.Series(5,index=range(0,10))

0    5
1    5
2    5
3    5
4    5
5    5
6    5
7    5
8    5
9    5
dtype: int64

In [8]:
thisdict =pd.Series({
  "brand": "Ford",
  "electric": False,
  "year": 1964,
  "colors": ["red", "white", "blue" ] })
thisdict

brand                     Ford
electric                 False
year                      1964
colors      [red, white, blue]
dtype: object

In [11]:
new=[1,2,3,4,5,6]
new=np.array(new)
new=pd.Series(new,index=['a','b','c','d','e','f'])
new

a    1
b    2
c    3
d    4
e    5
f    6
dtype: int64

# Accessing and Indexing
- .loc = selects by label (the index labels of the structure)
- .iloc = selects by integer position (0-based), regardless of the labels
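The difference between the two accessors is easiest to see when the labels are not 0, 1, 2. A minimal sketch, using a small throwaway Series `s`:

```python
import pandas as pd

s = pd.Series(["a", "b", "c"], index=[101, 102, 103])

by_label = s.loc[102]    # the element whose label is 102
by_position = s.iloc[1]  # the second element, whatever its label is

# 1 is a valid position but not a valid label, so .loc raises KeyError
try:
    s.loc[1]
    label_found = True
except KeyError:
    label_found = False

print(by_label, by_position, label_found)
```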


In [16]:
s_array

101    a
102    b
103    c
104    d
105    e
dtype: object

In [18]:
s_new_one


100    1
34     2
56     3
456    4
89     5
234    6
103    7
dtype: int64

In [19]:
s_new_one.iloc[2]

np.int64(3)

In [21]:
s_new_one.loc[34]

np.int64(2)

In [16]:
s_array.iloc[2]

'c'

In [17]:
s_array.loc[103]

'c'

In [18]:
s_array.loc[[102,105]] # fancy indexing: pass a list of labels

102    b
105    e
dtype: object

In [22]:
s_array[102]

'b'

In [12]:
new=[1,2,3,4,5,6]
new=np.array(new)
new=pd.Series(new,index=['a','b','c','d','e','f'])
new

a    1
b    2
c    3
d    4
e    5
f    6
dtype: int64

In [15]:
new.iloc[2]

np.int64(3)

In [17]:
new.loc['c']

np.int64(3)

In [18]:
new.iloc[5]

np.int64(6)

# DataFrame
- A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular format in rows and columns.
- Columns can potentially be of different types. Other properties:
    - size: mutable
    - labeled axes (rows and columns)
    - arithmetic operations can be performed row-wise and column-wise.
- A pandas DataFrame can be created using the constructor:
    - pandas.DataFrame(data, index, columns, dtype)
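A minimal sketch of the constructor arguments and of the properties listed above (labeled axes, mutable size, column-wise arithmetic). The names `table`, `r1`..`r3`, `x`, `y` and `total` are made up for illustration.

```python
import pandas as pd

table = pd.DataFrame(
    data=[[1, 2], [3, 4], [5, 6]],  # one inner list per row
    index=["r1", "r2", "r3"],       # row labels
    columns=["x", "y"],             # column labels
    dtype="float64",                # element type for all columns
)
shape_before = table.shape

# Size is mutable: adding a column changes the shape in place.
# The new column is computed with column-wise arithmetic.
table["total"] = table["x"] + table["y"]

print(shape_before, table.shape)
print(table.loc["r2", "total"])
```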

### Creating DataFrames
1. From a List of Lists

In [22]:
data=[[10,10,10,40,50],['ram','hari','shyam','alex','bob']]

In [23]:
data

[[10, 10, 10, 40, 50], ['ram', 'hari', 'shyam', 'alex', 'bob']]

In [24]:
pd.DataFrame(data,columns=[2017,2018,2019,2020,2021],index=['marks','name'])

       2017  2018   2019  2020  2021
marks    10    10     10    40    50
name    ram  hari  shyam  alex   bob
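Each inner list above becomes one row, which is why `marks` and `name` end up as row labels. If one column per field is preferred, the frame can be transposed. This is only a sketch; the names `wide` and `tall` are made up for the example.

```python
import pandas as pd

rows = [[10, 10, 10, 40, 50], ["ram", "hari", "shyam", "alex", "bob"]]

# Each inner list is a row: 2 rows, 5 columns
wide = pd.DataFrame(rows, index=["marks", "name"])

# .T swaps rows and columns: 5 rows, 2 columns
tall = wide.T

print(wide.shape, tall.shape)
print(list(tall.columns))
```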


# Using Dictionary

In [35]:
data={
    "names":['ram','hari','shyam','alex','bob'],
    "marks":[10,10,10,40,50]
}
df=pd.DataFrame(data)
df

   names  marks
0    ram     10
1   hari     10
2  shyam     10
3   alex     40
4    bob     50


In [36]:
type(df['names'])

pandas.core.series.Series

In [38]:
df['names']

0      ram
1     hari
2    shyam
3     alex
4      bob
Name: names, dtype: object

In [39]:
df[['names','marks']]

   names  marks
0    ram     10
1   hari     10
2  shyam     10
3   alex     40
4    bob     50
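The two selections above differ in the type they return: a single column name gives a Series, while a list of names gives a DataFrame, even if the list has only one entry. A short sketch using the same data:

```python
import pandas as pd

df = pd.DataFrame({
    "names": ["ram", "hari", "shyam", "alex", "bob"],
    "marks": [10, 10, 10, 40, 50],
})

one_column = df["names"]    # single label  -> Series
sub_frame = df[["names"]]   # list of labels -> DataFrame with one column

print(type(one_column).__name__, type(sub_frame).__name__)
```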


In [42]:
df.iloc[[0,3],0]

0     ram
3    alex
Name: names, dtype: object

In [45]:
df.loc[[0,3],['names']]

   names
0    ram
3   alex


In [48]:
# display rows with marks above 30
df[df['marks']>30]

   names  marks
3   alex     40
4    bob     50
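The filter above works because the comparison produces a boolean Series (a mask) that pandas uses to keep only the rows marked `True`. The sketch below makes the mask explicit and combines two conditions with `&`; the names `mask`, `high` and `only_alex` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "names": ["ram", "hari", "shyam", "alex", "bob"],
    "marks": [10, 10, 10, 40, 50],
})

mask = df["marks"] > 30  # one True/False value per row
high = df[mask]          # keeps only the rows where mask is True

# Combine conditions with & (and) or | (or); each condition needs parentheses
only_alex = df[(df["marks"] > 30) & (df["names"] != "bob")]

print(list(mask))
print(list(only_alex["names"]))
```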


In [52]:
print("dimension :", df.ndim)
print("shape :", df.shape)
print("length :",len(df))

dimension : 2
shape : (5, 2)
length : 5


In [53]:
# Slicing
df.loc[1:4]

   names  marks
1   hari     10
2  shyam     10
3   alex     40
4    bob     50
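Note that the `.loc` slice above includes the end label 4. Slicing with `.iloc` follows the usual Python rule and excludes the end position. A small sketch of the difference, using the same data:

```python
import pandas as pd

df = pd.DataFrame({
    "names": ["ram", "hari", "shyam", "alex", "bob"],
    "marks": [10, 10, 10, 40, 50],
})

label_slice = df.loc[1:4]      # labels 1 through 4, end included
position_slice = df.iloc[1:4]  # positions 1, 2, 3, end excluded

print(len(label_slice), len(position_slice))
```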


In [30]:
s_list=pd.Series(data=[1,2,3,4,5,6,7])
s_list
df=pd.DataFrame(s_list)
df

   0
0  1
1  2
2  3
3  4
4  5
5  6
6  7


In [31]:
data = np.array(['a','b','c','d','e'])
s_array =pd.Series(data,index=[101,102,103,104,105])
s_array
df=pd.DataFrame(s_array)
df

     0
101  a
102  b
103  c
104  d
105  e


# Accessing

In [19]:
# DataFrame from a Series, an array, or a list of dicts
new=[1,2,3,4,5,6]
new=np.array(new)
new=pd.Series(new,index=['a','b','c','d','e','f'])
new=pd.DataFrame(new)
new

   0
a  1
b  2
c  3
d  4
e  5
f  6


In [25]:
new = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"},
]
df=pd.DataFrame(new)
df

      name  age         city
0    Alice   30     New York
1      Bob   25  Los Angeles
2  Charlie   35      Chicago


In [24]:
data = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "San Francisco"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]

# Convert to DataFrame
df = pd.DataFrame(data)

df

      name  age           city
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35        Chicago
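In the examples above every dictionary has the same keys. When the keys differ, pandas uses the union of all keys as columns and fills the gaps with NaN. A minimal sketch with made-up records:

```python
import pandas as pd

records = [
    {"name": "Alice", "age": 25},        # no "city"
    {"name": "Bob", "city": "Chicago"},  # no "age"
]

# Each dict is one row; the columns are the union of all keys
people = pd.DataFrame(records)

print(list(people.columns))
print(people.isna().sum().to_dict())  # number of missing values per column
```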
