
# Python Modules
A module is a file containing Python statements and definitions.

We can define our most used functions in a module and import it, instead of copying their definitions into different programs.

We use modules to break down large programs into small, manageable, and organized files. Furthermore, modules make code reusable.
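
As a minimal sketch of this idea, the snippet below writes a tiny module to a temporary folder and then imports it. The module name `greetings` and its function `greet` are hypothetical, chosen only for illustration; in practice you would simply save the file next to your program.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway folder holding a hypothetical module named greetings.py
module_dir = Path(tempfile.mkdtemp())
(module_dir / "greetings.py").write_text(
    "def greet(name):\n"
    "    return 'Hello, ' + name\n"
)

# Make the folder importable, then import the module like any other
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import greetings

print(greetings.greet("Ada"))  # Hello, Ada
```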

## How to import modules in Python?

We use the **import** keyword to do this.

**EXAMPLE:** import math

### Python import statement
We can import a module using the import statement and access the definitions inside it using the dot operator.

In [None]:
import math
print("The value of pi is", math.pi)

### Import with renaming


In [None]:
import math as m
print("The value of pi is", m.pi)

We have renamed the math module as m. This can save us typing time in some cases.

Note that the name math is not recognized in our scope. Hence, math.pi is invalid, and m.pi is the correct way to access the value.

## Python from...import statement
We can import specific names from a module without importing the module as a whole. 

In [None]:
from math import pi
print("The value of pi is", pi)

Here, we imported only the **pi** attribute **from** the **math module**.

In such cases, we don't use the **dot** operator. We can also import multiple attributes as follows:

In [None]:
from math import pi, e

## Import all names
We can import all names (definitions) from a module using the following construct:

In [None]:
from math import *
print("The value of pi is", pi)

Here, we have imported all the definitions from the math module. This brings every name the module defines into our scope, except those beginning with an underscore (private definitions).

Importing everything with the asterisk (*) symbol is not a good programming practice. This can lead to duplicate definitions for an identifier. It also hampers the readability of our code.
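
To make the risk concrete, the sketch below lists the names a star-import of math would bring in and checks which of them share a name with a Python built-in. It only inspects the module and does not perform the star-import itself. The variable names `public_names` and `clashes` are illustrative.

```python
import builtins
import math

# Names a star-import would bring in: everything without a leading underscore
# (assuming the module defines no __all__ list, which is the case for math)
public_names = [name for name in dir(math) if not name.startswith("_")]

# Names that would silently replace a built-in of the same name
clashes = [name for name in public_names if hasattr(builtins, name)]

print(len(public_names), "names would be imported")
print("Shadowed built-ins:", clashes)
```

For example, `pow` appears in the list of clashes: after `from math import *`, the name `pow` refers to `math.pow` instead of the built-in function.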

______

# NUMPY

* NumPy, short for "Numerical Python", is a Python library for working with arrays.

* Python lists can serve the purpose of arrays, but they are slow to process. NumPy arrays are stored in one contiguous block of memory, unlike lists, so programs can access and manipulate them very efficiently.

* NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.

* The array object in NumPy is called ndarray
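
As a rough sketch of the difference in style (not a benchmark), the same calculation can be written with a plain list and with an ndarray. The list version loops over elements in Python, while the array version applies one operation to the whole array.

```python
import numpy as np

values = list(range(1, 6))

# Plain list: an explicit Python-level loop over the elements
squares_list = [v * v for v in values]

# ndarray: one vectorized expression applied to the whole array
arr = np.array(values)
squares_arr = arr * arr

print(squares_list)   # [1, 4, 9, 16, 25]
print(squares_arr)    # [ 1  4  9 16 25]
```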

## Import Numpy

In [None]:
import numpy

## NumPy Creating Arrays

* NumPy is used to work with arrays. The array object in NumPy is called ndarray.

* We can create a NumPy ndarray object by using the array() function.

* We can pass a list, tuple or any array-like object into the array() function, and it will be converted into an ndarray:



In [None]:
import numpy

arr = numpy.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


NumPy is usually imported under the np alias.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


## Dimensions in Arrays

### 0-D Arrays
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

In [None]:
import numpy as np

arr = np.array(42)

print(arr)

42


### 1-D Arrays
An array that has 0-D arrays as its elements is called a one-dimensional or 1-D array.

These are the most common and basic arrays.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


### 2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.

These are often used to represent matrices or 2nd-order tensors.

In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

[[1 2 3]
 [4 5 6]]


## Check the Number of Dimensions
NumPy arrays provide the ndim attribute, an integer that tells us how many dimensions the array has.

In [None]:
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])

print(a.ndim)
print(b.ndim)
print(c.ndim)

0
1
2


## Slicing arrays
* Slicing in Python means taking elements from one given index up to, but not including, another given index.

* Syntax: [start:end].

* Syntax with a step: [start:end:step].

* If we don't pass start, it's considered 0.

* If we don't pass end, it's considered the length of the array in that dimension.

* If we don't pass step, it's considered 1.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

[2 3 4 5]


In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[-3:-1])

[5 6]


In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5:2])

[2 4]
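
The default values for start, end and step listed at the top of this section can be sketched as follows, using the same array as the cells above:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:3])    # start omitted, so the slice begins at index 0: [1 2 3]
print(arr[4:])    # end omitted, so the slice runs to the end:    [5 6 7]
print(arr[::2])   # start and end omitted, step of 2:             [1 3 5 7]
```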


## Checking the Data Type of an Array
The NumPy array object has a property called dtype that returns the data type of the array:

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr.dtype)


int32
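
The integer type printed above can differ between platforms and NumPy versions (for example int32 or int64). When a specific type matters, it can be requested explicitly. A minimal sketch:

```python
import numpy as np

# Request a specific data type when creating the array
arr = np.array([1, 2, 3, 4], dtype=np.int64)
print(arr.dtype)        # int64

# astype() returns a converted copy; the original array is unchanged
as_float = arr.astype(np.float64)
print(as_float.dtype)   # float64
print(as_float)         # [1. 2. 3. 4.]
```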


## Get the Shape of an Array
NumPy arrays have an attribute called shape that returns a tuple giving the number of elements along each dimension. In the example below, (2, 4) means 2 rows with 4 elements each.

In [None]:
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr)
print(arr.shape)

[[1 2 3 4]
 [5 6 7 8]]
(2, 4)


## Iterating Arrays

In [None]:
import numpy as np

arr = np.array([1, 2, 3])

for x in arr:
  print(x)

1
2
3


In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
  print(x)

[1 2 3]
[4 5 6]
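
As the output above shows, looping over a 2-D array yields whole rows. To reach the individual values, one option is a nested loop, sketched here with a helper list named `values` that is only for illustration:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Iterating a 2-D array yields rows, so a nested loop reaches each scalar
values = []
for row in arr:
    for x in row:
        values.append(int(x))

print(values)  # [1, 2, 3, 4, 5, 6]
```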


## Joining NumPy Arrays

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr)

[1 2 3 4 5 6]
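
For 2-D arrays, concatenate() also accepts an axis argument. A short sketch with two small example arrays:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# axis=0 (the default) stacks rows; axis=1 joins the arrays side by side
stacked = np.concatenate((arr1, arr2), axis=0)
side_by_side = np.concatenate((arr1, arr2), axis=1)

print(stacked.shape)   # (4, 2)
print(side_by_side)
```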


## Splitting NumPy Arrays

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr)

[array([1, 2]), array([3, 4]), array([5, 6])]


In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr[0])
print(newarr[1])
print(newarr[2])

[1 2]
[3 4]
[5 6]
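
array_split() also works when the array cannot be divided evenly. A minimal sketch with 7 elements split into 3 parts:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# 7 elements cannot be divided evenly into 3 parts;
# array_split() makes the earlier parts one element longer
parts = np.array_split(arr, 3)

for part in parts:
    print(part)
```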


## Searching Arrays

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x)

(array([3, 5, 6], dtype=int64),)


In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

x = np.where(arr%2 == 0)

print(x)


(array([1, 3, 5, 7], dtype=int64),)
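
The outputs above are tuples because np.where() returns one index array per dimension. For a 1-D array, the indices can be taken from position 0 of that tuple, as in this sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

# np.where(condition) returns a tuple with one index array per dimension
indices = np.where(arr == 4)[0]

print(indices)        # [3 5 6]
print(arr[indices])   # [4 4 4]
```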


## Sorting Arrays

In [None]:
import numpy as np

arr = np.array([3, 2, 0, 1])

print(np.sort(arr))

[0 1 2 3]


In [None]:
import numpy as np

arr = np.array(['banana', 'cherry', 'apple'])

print(np.sort(arr))

['apple' 'banana' 'cherry']
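
np.sort() returns a sorted copy and leaves the original array unchanged. One simple way to get descending order, sketched below, is to reverse that copy with a slice:

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

ascending = np.sort(arr)       # sorted copy
descending = ascending[::-1]   # reverse the sorted copy

print(arr)          # [3 2 0 1]  (unchanged)
print(descending)   # [3 2 1 0]
```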


## NumPy Filter Array

In [None]:
import numpy as np

arr = np.array([41, 42, 43, 44])

filter_arr = arr > 42

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)

[False False  True  True]
[43 44]
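
Several conditions can be combined into one filter. This sketch uses the same array; the result names are illustrative:

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

# Combine element-wise conditions with & (and) or | (or);
# the parentheses are required because & and | bind tighter than comparisons
even_and_large = arr[(arr > 41) & (arr % 2 == 0)]
small_or_large = arr[(arr < 42) | (arr > 43)]

print(even_and_large)   # [42 44]
print(small_or_large)   # [41 44]
```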


# PANDAS

Pandas is an open-source library designed for working with relational or labeled data easily and intuitively. It provides data structures and operations for manipulating numerical data and time series, and it is built on top of the NumPy library. Its main features include:

* Fast and efficient for manipulating and analyzing data.
* Data from different file objects can be loaded.
* Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data
* Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects
* Data set merging and joining.
* Flexible reshaping and pivoting of data sets
* Provides time-series functionality.
* Powerful group by functionality for performing split-apply-combine operations on data sets.
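
As a small sketch of the split-apply-combine idea from the last bullet, the example below groups a made-up table by city and sums each group. The DataFrame `sales` and its columns are hypothetical.

```python
import pandas as pd

# Hypothetical data, used only for illustration
sales = pd.DataFrame({
    "city": ["Delhi", "Mumbai", "Delhi", "Mumbai"],
    "amount": [100, 200, 300, 400],
})

# Split by city, apply sum() to each group, combine into one Series
totals = sales.groupby("city")["amount"].sum()
print(totals)
```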

## Import Pandas

In [None]:
import pandas as pd

## Pandas Series
A Pandas Series is a one-dimensional array holding data of any type.

In [None]:
import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a)

print(myvar)

0    1
1    7
2    2
dtype: int64


In [None]:
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories)

print(myvar)

day1    420
day2    380
day3    390
dtype: int64


## Labels
If nothing else is specified, the values are labeled with their index number. The first value has index 0, the second value has index 1, and so on.

This label can be used to access a specified value.

In [None]:
myvar = pd.Series([1, 7, 2])  # recreate the Series with default integer labels
print(myvar[0])

1


Create your own labels:

In [None]:
import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a, index = ["x", "y", "z"])

print(myvar)

x    1
y    7
z    2
dtype: int64
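
With custom labels in place, a value can be read by its label, or by its position through iloc. A minimal sketch:

```python
import pandas as pd

myvar = pd.Series([1, 7, 2], index=["x", "y", "z"])

print(myvar["y"])      # access by label: 7
print(myvar.iloc[1])   # access by position: 7
```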


## DataFrames

Data sets in Pandas are usually multi-dimensional tables, called DataFrames.

A Series is like a column; a DataFrame is the whole table.

## Create an Empty DataFrame
The most basic DataFrame that can be created is an empty DataFrame.

In [None]:
import pandas as pd
df = pd.DataFrame()
print(df)

Empty DataFrame
Columns: []
Index: []


## Create a DataFrame from Lists
The DataFrame can be created using a single list or a list of lists.

In [None]:
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)

   0
0  1
1  2
2  3
3  4
4  5


In [None]:
data = [['Alex',10,'10'],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age','Marks'])
print(df)


     Name  Age Marks
0    Alex   10    10
1     Bob   12  None
2  Clarke   13  None


**Pandas uses the loc attribute to return one or more specified rows.**

In [None]:
print(df.loc[0])

Name     Alex
Age        10
Marks      10
Name: 0, dtype: object


## Create a DataFrame from Dictionary

In [None]:
import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

myvar = pd.DataFrame(data)

print(myvar)

   calories  duration
0       420        50
1       380        40
2       390        45

**Note:** In the next example, **NaN** (Not a Number) fills the positions where a dictionary has no value for a column.

In [None]:
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)

   a   b     c
0  1   2   NaN
1  5  10  20.0


In [None]:
import pandas as pd

d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
   'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)
print(df)

   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4


In [None]:
#refer to the named index:
print(df.loc["a"])

one    1.0
two    1.0
Name: a, dtype: float64


# pandas.DataFrame.fillna
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)

In [None]:
import pandas as pd
import numpy as np
df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, np.nan],
                   [np.nan, 3, np.nan, 4]],
                  columns=list("ABCD"))

x = df.fillna(0)
print(df)
print(x)

     A    B   C    D
0  NaN  2.0 NaN  0.0
1  3.0  4.0 NaN  1.0
2  NaN  NaN NaN  NaN
3  NaN  3.0 NaN  4.0
     A    B    C    D
0  0.0  2.0  0.0  0.0
1  3.0  4.0  0.0  1.0
2  0.0  0.0  0.0  0.0
3  0.0  3.0  0.0  4.0


## Replace Using Mean, Median, or Mode

In [None]:
import pandas as pd
import numpy as np

df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, np.nan],
                   [np.nan, 3, np.nan, 4]],
                  columns=list("ABCD"))

x = df["A"].mean()

df["A"] = df["A"].fillna(x)  # assign back; avoids in-place changes on a selected column, which newer pandas versions warn about
print(df)

     A    B   C    D
0  3.0  2.0 NaN  0.0
1  3.0  4.0 NaN  1.0
2  3.0  NaN NaN  NaN
3  3.0  3.0 NaN  4.0
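
The same pattern works for the median and the mode. The sketch below uses a small hypothetical DataFrame with a numeric column `score` and a text column `grade`:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column
df = pd.DataFrame({"score": [10.0, np.nan, 30.0, 50.0],
                   "grade": ["A", "B", None, "B"]})

# Median of the known scores (10, 30, 50) is 30
df["score"] = df["score"].fillna(df["score"].median())

# mode() returns a Series of the most frequent values; take the first one
df["grade"] = df["grade"].fillna(df["grade"].mode()[0])

print(df)
```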


# pandas.DataFrame.dropna
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

In [None]:
df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [np.nan, 'Batmobile', 'Bullwhip'],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"),
                            pd.NaT]})
print(df)
df1 = df.dropna()
print("")
print(df1)

       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

     name        toy       born
1  Batman  Batmobile 1940-04-25


In [None]:
df.dropna(subset=['toy'], inplace = True)
print(df)

       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
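
The how and thresh parameters from the signature above control how strict dropna() is. A minimal sketch on a small hypothetical DataFrame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, np.nan, np.nan],
                   "b": [2, 5, np.nan],
                   "c": [3, np.nan, np.nan]})

# how='all' drops a row only when every value in it is missing
only_all_missing_dropped = df.dropna(how="all")
print(only_all_missing_dropped)

# thresh=2 keeps rows that have at least 2 non-missing values
at_least_two_values = df.dropna(thresh=2)
print(at_least_two_values)
```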


## Removing Duplicates

In [None]:
df.drop_duplicates(inplace = True)
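
To see the effect, here is a minimal sketch on a small hypothetical DataFrame that contains one repeated row:

```python
import pandas as pd

# Hypothetical data with one repeated row
df = pd.DataFrame({"name": ["Alfred", "Batman", "Batman"],
                   "toy": ["Cape", "Batmobile", "Batmobile"]})

deduped = df.drop_duplicates()   # keeps the first occurrence by default
print(deduped)
```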

# pandas.DataFrame.set_index
DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)

In [None]:
df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})
print(df)
df.set_index('month')


   month  year  sale
0      1  2012    55
1      4  2014    40
2      7  2013    84
3     10  2014    31


       year  sale
month
1      2012    55
4      2014    40
7      2013    84
10     2014    31
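
Since inplace defaults to False, set_index() returns a new DataFrame and leaves the original untouched. A short sketch, where the name `by_month` is illustrative:

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})

# set_index() returns a new DataFrame by default; df keeps its 0..3 index
by_month = df.set_index('month')

print(by_month.loc[7, 'sale'])   # look up the sale for month 7: 84
print(list(df.columns))          # ['month', 'year', 'sale']
```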


# pandas.DataFrame.reset_index
DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

In [None]:
df = pd.DataFrame([('bird', 389.0),
                   ('bird', 24.0),
                   ('mammal', 80.5),
                   ('mammal', np.nan)],
                  index=['falcon', 'parrot', 'lion', 'monkey'],
                  columns=('class', 'max_speed'))
print(df)

         class  max_speed
falcon    bird      389.0
parrot    bird       24.0
lion    mammal       80.5
monkey  mammal        NaN


In [None]:
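df.reset_index()            # moves the old index into a new 'index' column (result not displayed here)
df.reset_index(drop=True)   # discards the old index; as the last expression, this result is displayed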

    class  max_speed
0    bird      389.0
1    bird       24.0
2  mammal       80.5
3  mammal        NaN


# Reading CSV files as DataFrames

In [None]:
import pandas as pd

data = pd.read_csv('C:/Users/shaimishra/Downloads/student.csv')
print(data)

    id         name  class  mark  gender
0    1     John Deo   Four    75  female
1    2     Max Ruin  Three    85    male
2    3       Arnold  Three    55    male
3    4   Krish Star   Four    60  female
4    5    John Mike   Four    60  female
5    6    Alex John   Four    55    male
6    7  My John Rob  Fifth    78    male
7    8       Asruid   Five    85    male
8    9      Tes Qry    Six    78    male
9   10     Big John   Four    55  female
10  11       Ronald    Six    89  female
11  12        Recky    Six    94  female
12  13          Kty  Seven    88  female
13  14         Bigy  Seven    88  female
14  15     Tade Row   Four    88    male
15  16        Gimmy   Four    88    male
16  17        Tumyu    Six    54    male
17  18        Honny   Five    75    male
18  19        Tinny   Nine    18    male
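
The path above points to a file on the author's machine. To try read_csv() without that file, a sketch like the following reads CSV text from memory instead. The variable `sample` and the three rows (copied from the table above) are for illustration only.

```python
import io
import pandas as pd

# Inline CSV text standing in for a file on disk
csv_text = """id,name,class,mark,gender
1,John Deo,Four,75,female
2,Max Ruin,Three,85,male
3,Arnold,Three,55,male
"""

sample = pd.read_csv(io.StringIO(csv_text))
print(sample)
```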


In [None]:
len(data)

19

In [None]:
data.shape

(19, 5)

In [None]:
data.head(6)

   id        name  class  mark  gender
0   1    John Deo   Four    75  female
1   2    Max Ruin  Three    85    male
2   3      Arnold  Three    55    male
3   4  Krish Star   Four    60  female
4   5   John Mike   Four    60  female
5   6   Alex John   Four    55    male


In [None]:
data.tail()

    id      name  class  mark  gender
14  15  Tade Row   Four    88    male
15  16     Gimmy   Four    88    male
16  17     Tumyu    Six    54    male
17  18     Honny   Five    75    male
18  19     Tinny   Nine    18    male


In [None]:
data.describe()

              id       mark
count  19.000000  19.000000
mean   10.000000  72.000000
std     5.627314  19.177533
min     1.000000  18.000000
25%     5.500000  57.500000
50%    10.000000  78.000000
75%    14.500000  88.000000
max    19.000000  94.000000


In [None]:
data.dtypes

id         int64
name      object
class     object
mark       int64
gender    object
dtype: object

In [None]:
data.columns

Index(['id', 'name', 'class', 'mark', 'gender'], dtype='object')

In [None]:
data['mark']>70

0      True
1      True
2     False
3     False
4     False
5     False
6      True
7      True
8      True
9     False
10     True
11     True
12     True
13     True
14     True
15     True
16    False
17     True
18    False
Name: mark, dtype: bool

In [None]:
data_above70 = data[data['mark']>70]
print(data_above70)

    id         name  class  mark  gender
0    1     John Deo   Four    75  female
1    2     Max Ruin  Three    85    male
6    7  My John Rob  Fifth    78    male
7    8       Asruid   Five    85    male
8    9      Tes Qry    Six    78    male
10  11       Ronald    Six    89  female
11  12        Recky    Six    94  female
12  13          Kty  Seven    88  female
13  14         Bigy  Seven    88  female
14  15     Tade Row   Four    88    male
15  16        Gimmy   Four    88    male
17  18        Honny   Five    75    male


In [None]:
data_above70.head()

   id         name  class  mark  gender
0   1     John Deo   Four    75  female
1   2     Max Ruin  Three    85    male
6   7  My John Rob  Fifth    78    male
7   8       Asruid   Five    85    male
8   9      Tes Qry    Six    78    male
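
Filters can also combine several conditions. The sketch below uses a hypothetical four-row subset of the student data, named `students`:

```python
import pandas as pd

# Hypothetical subset of the student data, for illustration
students = pd.DataFrame({"name": ["John Deo", "Max Ruin", "Arnold", "Recky"],
                         "mark": [75, 85, 55, 94],
                         "gender": ["female", "male", "male", "female"]})

# Wrap each condition in parentheses and join them with & (and) or | (or)
top_female = students[(students["mark"] > 70) & (students["gender"] == "female")]
print(top_female)
```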


In [None]:
import pandas as pd
   
# creating a sample dataframe
data = pd.DataFrame({'Brand' : ['Maruti', 'Hyundai', 'Tata',
                                'Mahindra', 'Maruti', 'Hyundai',
                                'Renault', 'Tata', 'Maruti'],
                     'Year' : [2012, 2014, 2011, 2015, 2012, 
                               2016, 2014, 2018, 2019],
                     'Kms Driven' : [50000, 30000, 60000, 
                                     25000, 10000, 46000, 
                                     31000, 15000, 12000],
                     'City' : ['Gurgaon', 'Delhi', 'Mumbai', 
                               'Delhi', 'Mumbai', 'Delhi', 
                               'Mumbai','Chennai',  'Ghaziabad'],
                     'Mileage' :  [28, 27, 25, 26, 28, 
                                   29, 24, 21, 24]})
   
# displaying the DataFrame
display(data)

data.to_csv('Car_Details.csv')

      Brand  Year  Kms Driven       City  Mileage
0    Maruti  2012       50000    Gurgaon       28
1   Hyundai  2014       30000      Delhi       27
2      Tata  2011       60000     Mumbai       25
3  Mahindra  2015       25000      Delhi       26
4    Maruti  2012       10000     Mumbai       28
5   Hyundai  2016       46000      Delhi       29
6   Renault  2014       31000     Mumbai       24
7      Tata  2018       15000    Chennai       21
8    Maruti  2019       12000  Ghaziabad       24
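
By default, to_csv() also writes the row index as an unnamed first column, which shows up as "Unnamed: 0" when the file is read back. A sketch of the difference, using in-memory buffers instead of files and a hypothetical two-row DataFrame named `cars`:

```python
import io
import pandas as pd

cars = pd.DataFrame({"Brand": ["Maruti", "Hyundai"], "Mileage": [28, 27]})

# Default: the index is written as an extra, unnamed first column
with_index = io.StringIO()
cars.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)
print(cols_with_index)      # ['Unnamed: 0', 'Brand', 'Mileage']

# index=False writes only the data columns
without_index = io.StringIO()
cars.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index).columns)
print(cols_without_index)   # ['Brand', 'Mileage']
```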


# loc VS iloc

## loc in Pandas
loc is label-based, which means that we specify the names (labels) of the rows and columns that we want to select.

For example, let's say we search for the rows whose index is 1, 2 or 100. We will not get the first, second or the hundredth row here. Instead, we will get the results only if the name of any index is 1, 2 or 100.

So, we can filter the data using the loc indexer in Pandas even if the index labels in our dataset are not integers.

## iloc in Pandas
On the other hand, iloc is integer position-based. So here, we have to specify rows and columns by their integer position, counting from 0.

Let's say we search for the rows at positions 1, 2 or 100. iloc will return the second, third and 101st rows, regardless of the names or labels we have in the index of our dataset.

# EXAMPLE 1

In [None]:
import pandas as pd
   
# creating a sample dataframe
data = pd.DataFrame({'Brand' : ['Maruti', 'Hyundai', 'Tata',
                                'Mahindra', 'Maruti', 'Hyundai',
                                'Renault', 'Tata', 'Maruti'],
                     'Year' : [2012, 2014, 2011, 2015, 2012, 
                               2016, 2014, 2018, 2019],
                     'Kms Driven' : [50000, 30000, 60000, 
                                     25000, 10000, 46000, 
                                     31000, 15000, 12000],
                     'City' : ['Gurgaon', 'Delhi', 'Mumbai', 
                               'Delhi', 'Mumbai', 'Delhi', 
                               'Mumbai','Chennai',  'Ghaziabad'],
                     'Mileage' :  [28, 27, 25, 26, 28, 
                                   29, 24, 21, 24]})
   
# displaying the DataFrame
display(data)

      Brand  Year  Kms Driven       City  Mileage
0    Maruti  2012       50000    Gurgaon       28
1   Hyundai  2014       30000      Delhi       27
2      Tata  2011       60000     Mumbai       25
3  Mahindra  2015       25000      Delhi       26
4    Maruti  2012       10000     Mumbai       28
5   Hyundai  2016       46000      Delhi       29
6   Renault  2014       31000     Mumbai       24
7      Tata  2018       15000    Chennai       21
8    Maruti  2019       12000  Ghaziabad       24


In [None]:
display(data.loc[(data['Brand'] == 'Maruti') & (data['Mileage'] > 25)])

    Brand  Year  Kms Driven     City  Mileage
0  Maruti  2012       50000  Gurgaon       28
4  Maruti  2012       10000   Mumbai       28


In [None]:

# selecting rows with labels 2 through 5 (loc slices include both endpoints)
display(data.loc[2 : 5])

      Brand  Year  Kms Driven    City  Mileage
2      Tata  2011       60000  Mumbai       25
3  Mahindra  2015       25000   Delhi       26
4    Maruti  2012       10000  Mumbai       28
5   Hyundai  2016       46000   Delhi       29


In [None]:

# updating values of Mileage if Year < 2015
data.loc[(data.Year < 2015), ['Mileage']] = 22
display(data)

      Brand  Year  Kms Driven       City  Mileage
0    Maruti  2012       50000    Gurgaon       22
1   Hyundai  2014       30000      Delhi       22
2      Tata  2011       60000     Mumbai       22
3  Mahindra  2015       25000      Delhi       26
4    Maruti  2012       10000     Mumbai       22
5   Hyundai  2016       46000      Delhi       29
6   Renault  2014       31000     Mumbai       22
7      Tata  2018       15000    Chennai       21
8    Maruti  2019       12000  Ghaziabad       24


In [None]:

# selecting the rows at positions 0, 2, 4, and 7
display(data.iloc[[0, 2, 4, 7]])

    Brand  Year  Kms Driven     City  Mileage
0  Maruti  2012       50000  Gurgaon       22
2    Tata  2011       60000   Mumbai       22
4  Maruti  2012       10000   Mumbai       22
7    Tata  2018       15000  Chennai       21


In [None]:

# selecting rows at positions 1 to 4 and columns at positions 2 to 4 (iloc excludes the end position 5)
display(data.iloc[1 : 5, 2 : 5])

   Kms Driven    City  Mileage
1       30000   Delhi       22
2       60000  Mumbai       22
3       25000   Delhi       26
4       10000  Mumbai       22


# EXAMPLE 2

In [None]:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'assists': [11, 8, 10, 6, 6, 5, 9, 12]},
                   index=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'])

#view DataFrame
df


  team  points  assists
A    A       5       11
B    A       7        8
C    A       7       10
D    A       9        6
E    B      12        6
F    B       9        5
G    B       9        9
H    B       4       12


In [None]:
#select rows with index labels 'E' and 'F'
df.loc[['E', 'F']]

  team  points  assists
E    B      12        6
F    B       9        5


In [None]:
#select 'E' and 'F' rows and 'team' and 'assists' columns
df.loc[['E', 'F'], ['team', 'assists']]

  team  assists
E    B        6
F    B        5


In [None]:
#select rows in index positions 4 through 6 (not including 6)
df.iloc[4:6]

  team  points  assists
E    B      12        6
F    B       9        5


In [None]:
#select rows in positions 4 through 5 and columns in positions 0 through 1 (end positions 6 and 2 are excluded)
df.iloc[4:6, 0:2]

  team  points
E    B      12
F    B       9


# EXAMPLE 3

In [None]:
s = pd.Series(list("abcdef"), index=[49, 48, 47, 0, 1, 2]) 
print(s)

49    a
48    b
47    c
0     d
1     e
2     f
dtype: object


In [None]:
s.loc[0]    # value at index label 0

'd'

In [None]:
s.iloc[0]    # value at index location 0

'a'
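
Slices behave differently too: loc includes the end label, while iloc excludes the end position. A minimal sketch with a small Series that uses letter labels:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

by_label = s.loc["a":"c"]    # label slice: includes the end label "c"
by_position = s.iloc[0:2]    # position slice: excludes position 2

print(by_label.tolist())      # [10, 20, 30]
print(by_position.tolist())   # [10, 20]
```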

# to_dict()

In [None]:
import pandas as pd

data = pd.DataFrame({'Brand' : ['Maruti', 'Hyundai', 'Tata',
                                'Mahindra', 'Maruti', 'Hyundai',
                                'Renault', 'Tata', 'Maruti'],
                     'Year' : [2012, 2014, 2011, 2015, 2012, 
                               2016, 2014, 2018, 2019],
                     'Kms Driven' : [50000, 30000, 60000, 
                                     25000, 10000, 46000, 
                                     31000, 15000, 12000],
                     'City' : ['Gurgaon', 'Delhi', 'Mumbai', 
                               'Delhi', 'Mumbai', 'Delhi', 
                               'Mumbai','Chennai',  'Ghaziabad'],
                     'Mileage' :  [28, 27, 25, 26, 28, 
                                   29, 24, 21, 24]})

In [None]:
display(data)
result = data.to_dict(orient='records')
print(result)

      Brand  Year  Kms Driven       City  Mileage
0    Maruti  2012       50000    Gurgaon       28
1   Hyundai  2014       30000      Delhi       27
2      Tata  2011       60000     Mumbai       25
3  Mahindra  2015       25000      Delhi       26
4    Maruti  2012       10000     Mumbai       28
5   Hyundai  2016       46000      Delhi       29
6   Renault  2014       31000     Mumbai       24
7      Tata  2018       15000    Chennai       21
8    Maruti  2019       12000  Ghaziabad       24


[{'Brand': 'Maruti', 'Year': 2012, 'Kms Driven': 50000, 'City': 'Gurgaon', 'Mileage': 28}, {'Brand': 'Hyundai', 'Year': 2014, 'Kms Driven': 30000, 'City': 'Delhi', 'Mileage': 27}, {'Brand': 'Tata', 'Year': 2011, 'Kms Driven': 60000, 'City': 'Mumbai', 'Mileage': 25}, {'Brand': 'Mahindra', 'Year': 2015, 'Kms Driven': 25000, 'City': 'Delhi', 'Mileage': 26}, {'Brand': 'Maruti', 'Year': 2012, 'Kms Driven': 10000, 'City': 'Mumbai', 'Mileage': 28}, {'Brand': 'Hyundai', 'Year': 2016, 'Kms Driven': 46000, 'City': 'Delhi', 'Mileage': 29}, {'Brand': 'Renault', 'Year': 2014, 'Kms Driven': 31000, 'City': 'Mumbai', 'Mileage': 24}, {'Brand': 'Tata', 'Year': 2018, 'Kms Driven': 15000, 'City': 'Chennai', 'Mileage': 21}, {'Brand': 'Maruti', 'Year': 2019, 'Kms Driven': 12000, 'City': 'Ghaziabad', 'Mileage': 24}]
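
The orient argument controls the shape of the resulting dictionary. As a sketch, here are three common values applied to a hypothetical two-row DataFrame named `cars`:

```python
import pandas as pd

cars = pd.DataFrame({"Brand": ["Maruti", "Tata"], "Mileage": [28, 25]})

as_records = cars.to_dict(orient="records")   # one dict per row
as_lists = cars.to_dict(orient="list")        # column name mapped to a list of values
as_index = cars.to_dict(orient="index")       # row label mapped to a dict of that row

print(as_records)
print(as_lists)
print(as_index)
```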


## Important Topics for Pandas
* isnull
* replace
* filter
* rename
* sort_values
* groupby
* apply
* applymap (renamed to DataFrame.map in pandas 2.1)
* map
* append (removed in pandas 2.0; use concat instead)
* concat
* join
* merge
* pivot_table
* .any()
* .all()
* subset a dataframe with multiple conditions
* transpose
* to_datetime
* drop_duplicates
* isin

