## <u>Agenda</u>

1. Python Concepts Recap
    - Data structures
    - Control Statements
    - Loops
    - Functional Programming
2. Problems on Python Concepts
3. Pandas and Numpy Recap
    - Pandas Data Structure and initial analysis
    - Slicing, Filtering, Sorting
4. Problems on Pandas and Numpy

### Data Structures and Loops


***Lists***

Python has a great built-in list type named "list". List literals are written within square brackets [ ]. Lists work similarly to strings -- use the len() function and square brackets [ ] to access data, with the first element at index 0. 

List Methods
Here are some other common list methods.

1. list.append(elem) -- adds a single element to the end of the list. Common error: does not return the new list, just modifies the original.
2. list.insert(index, elem) -- inserts the element at the given index, shifting elements to the right.
3. list.extend(list2) -- adds the elements in list2 to the end of the list. Using + or += on a list is similar to using extend().
4. list.index(elem) -- searches for the given element from the start of the list and returns its index. Throws a ValueError if the element does not appear (use "in" to check without a ValueError).
5. list.remove(elem) -- searches for the first instance of the given element and removes it (throws ValueError if not present)
6. list.sort() -- sorts the list in place (does not return it). (The sorted() function shown later is preferred.)
7. list.reverse() -- reverses the list in place (does not return it)
8. list.pop(index) -- removes and returns the element at the given index. Returns the rightmost element if index is omitted (roughly the opposite of append()).

Notice that these are *methods* on a list object, while len() is a function that takes the list (or string or whatever) as an argument.
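
The methods listed above can be tried on a small list. This is a minimal sketch; the lists `items` and `numbers` and their values are made up for illustration.

```python
# Hypothetical example list used only for illustration
items = ['b', 'd', 'a']

items.insert(1, 'c')          # ['b', 'c', 'd', 'a']
position = items.index('d')   # 2
items.remove('b')             # ['c', 'd', 'a']
last = items.pop()            # returns 'a'; items is now ['c', 'd']

# sort() works in place and returns None; sorted() returns a new list
numbers = [3, 1, 2]
result = numbers.sort()                     # result is None, numbers is [1, 2, 3]
descending = sorted(numbers, reverse=True)  # [3, 2, 1]; numbers is unchanged

# 'in' checks membership without raising ValueError
has_z = 'z' in items          # False

print(items, position, last, result, descending, has_z)
```

Assigning the result of `sort()` (or `append()`) to a variable is the common mistake mentioned above: the variable ends up holding `None`.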

In [34]:
# lists
groceries = ['Tomato', 'Potato', 'Chilli', 'Onion']
print(groceries)   

['Tomato', 'Potato', 'Chilli', 'Onion']


In [35]:
# accessing elements of list
groceries[-1]

'Onion'

In [36]:
# adding into lists
groceries.append('Cabbage')
print(groceries)      

['Tomato', 'Potato', 'Chilli', 'Onion', 'Cabbage']


In [38]:
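# Note: the nested ['apple', 'grapes'] in the output below presumably comes from an
# earlier groceries.append(fruits) call that is not shown; extend() adds items one by one
fruits = ['apple', 'grapes']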
groceries.extend(fruits)
groceries

['Tomato',
 'Potato',
 'Chilli',
 'Onion',
 'Cabbage',
 ['apple', 'grapes'],
 'apple',
 'grapes']

In [39]:
# replacing elements in lists
groceries[2] = 'Garlic'
print(groceries)       

['Tomato', 'Potato', 'Garlic', 'Onion', 'Cabbage', ['apple', 'grapes'], 'apple', 'grapes']


In [None]:
# looping through lists
for i in groceries:
    print(i)

In [None]:
# list size 
print(len(groceries))  

In [None]:
fruits = ['Apple', 'Grapes']

# adding the elements of another list (append() would add it as one nested sublist)
groceries.extend(fruits)  
print(groceries)

In [41]:
# delete the second-to-last element (use del groceries[-1] for the last one)
del groceries[-2]       
print(groceries)

['Tomato', 'Potato', 'Garlic', 'Onion', 'Cabbage', 'apple']


In [42]:
# delete the list itself; the name no longer exists, so print() raises NameError
del groceries           
print(groceries)

NameError: name 'groceries' is not defined

***Tuples***

A Python tuple is an ordered collection of objects separated by commas. Like a list, a tuple supports indexing, nesting, and repetition, but a tuple is immutable, whereas a list is mutable.
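
Immutability can be checked directly. In this small sketch the tuple `point` is a made-up example; reading by index works, while assigning to an index raises a TypeError.

```python
point = (1, 2, 3)          # hypothetical tuple for illustration

# Indexing and repetition work as they do for lists
first = point[0]           # 1
repeated = point * 2       # (1, 2, 3, 1, 2, 3)

# Item assignment is not allowed on a tuple
try:
    point[0] = 99
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(first, repeated, error_name)
```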

In [None]:
tpl=() 
type(tpl)

# string tuple
tup = ('James', 'Harry', 'Mary', 'John')
print(tup)

# int tuple
tup = (10, 20, 30 ,40)
print(tup)

In [None]:
# heterogeneous tuple
tup = ('John', 10, 'Mary', 20, 'Harry', 30)
tup

In [43]:
# without enclosing within brackets
tup = 'James', 'Harry', 'Mary', 'John'
print(type(tup))

list_of_names = list(tup)
list_of_names

<class 'tuple'>


['James', 'Harry', 'Mary', 'John']

In [46]:
names = ('Jeff', 'Bill', 'Steve', 'Yash') 
names[0]

# negative index
names[-2]

# index out of range error
# names[8]

len(names)

4

In [48]:
# modifying tuple elements
names = ('Jeff', 'Bill', 'Steve', 'Yash') 

names = list(names)
names[0] = 'Swati'

In [50]:
names = tuple(names)
names

('Swati', 'Bill', 'Steve', 'Yash')

In [None]:
# tuples do not support item deletion; this raises TypeError
del names[0]

***Sets***

Sets are mutable. However, since they are unordered, indexing has no meaning.

We cannot access or change an element of a set using indexing or slicing. Set data type does not support it.

We can add a single element using the add() method, and multiple elements using the update() method. The update() method can take tuples, lists, strings or other sets as its argument. In all cases, duplicates are avoided.
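
A short sketch of `add()` and `update()` with different argument types follows. The set `letters` is a made-up example. Note that `update()` iterates over a string character by character.

```python
letters = {'a'}
letters.add('b')                 # single element
letters.update(['b', 'c'])       # list: 'b' is already present, so only 'c' is added
letters.update(('d',), {'e'})    # a tuple and another set in one call
letters.update('fg')             # a string contributes its individual characters

in_order = sorted(letters)       # sorted() returns a list in a stable order for display
print(in_order)

# Sets do not support indexing
try:
    letters[0]
    indexing_error = None
except TypeError as exc:
    indexing_error = type(exc).__name__

print(indexing_error)
```

To add a whole string as one element, pass it to `add()` or wrap it in a list before calling `update()`.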

In [53]:
num = set()
type(num)

set

In [55]:
num = {20, 50, 10, 40, 30, 20, 20}
num

{10, 20, 30, 40, 50}

In [56]:
# Using add() method   
months = {'January', 'February', 'March', 'April', 'May'}
months.add('June')
months

{'April', 'February', 'January', 'June', 'March', 'May'}

In [59]:
# Using update() method
months.update(['October', 123])
months

{123,
 'April',
 'D',
 'December',
 'February',
 'January',
 'June',
 'March',
 'May',
 'N',
 'November',
 'October',
 'b',
 'c',
 'e',
 'm',
 'o',
 'r',
 'v'}

In [60]:
# Using discard() method 
months.discard('November')
months

{123,
 'April',
 'D',
 'December',
 'February',
 'January',
 'June',
 'March',
 'May',
 'N',
 'October',
 'b',
 'c',
 'e',
 'm',
 'o',
 'r',
 'v'}

In [62]:
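# Using remove() method: unlike discard(), it raises KeyError when the element is missing
# (here 'April' had apparently been removed already by an earlier run of this cell)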
months.remove('April')
months

KeyError: 'April'

In [64]:
# Union of two Sets
set1 = {'Monday', 'Tuesday', 'Wednesday', 'Thursday'}
set2 = {'Friday', 'Saturday', 'Sunday'}
set1|set2

{'Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday', 'Wednesday'}

In [65]:
#Intersection of two sets
set1 = {'Monday', 'Tuesday', 'Wednesday', 'Thursday'}
set2 = {'Friday', 'Saturday', 'Sunday', 'Monday'}

set1&set2

{'Monday'}

In [66]:
# Difference between the two sets
set1 = {'Monday', 'Tuesday', 'Wednesday', 'Thursday'}
set2 = {'Friday', 'Saturday', 'Sunday', 'Monday'}
set2-set1

{'Friday', 'Saturday', 'Sunday'}

***Dictionary***

A Python dictionary is a collection of items, where each item is a key/value pair. Since Python 3.7, dictionaries preserve insertion order.

Dictionaries are optimized to retrieve values when the key is known.
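
Looking up a key that does not exist is a common source of errors. The sketch below, using a made-up `prices` dictionary, shows three ways to handle it.

```python
prices = {'Tomato': 5, 'Potato': 3}   # hypothetical data

# Square brackets raise KeyError for a missing key
try:
    prices['Onion']
    missing = None
except KeyError as exc:
    missing = exc.args[0]             # the key that was not found

# get() returns a default instead of raising
onion_price = prices.get('Onion', 0)  # 0

# 'in' tests whether a key exists
has_tomato = 'Tomato' in prices       # True

# Iteration follows insertion order (guaranteed since Python 3.7)
prices['Apple'] = 2
keys_in_order = list(prices)

print(missing, onion_price, has_tomato, keys_in_order)
```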


In [68]:
# dictionary
groceries = {'Tomato': 5, 'Potato': 3, 'Chilli': 10, 'Onion': 7}
print(groceries)


{'Tomato': 5, 'Potato': 3, 'Chilli': 10, 'Onion': 7}


In [69]:
# accessing values for specific keys
print(groceries['Chilli'])

10


In [70]:
# adding items into dictionary
groceries['Cabbage'] = 1
print(groceries)

{'Tomato': 5, 'Potato': 3, 'Chilli': 10, 'Onion': 7, 'Cabbage': 1}


In [71]:
# update() replaces the value if the key exists; 'apple' is a new key, so it is added
groceries.update({'apple': 2})
print(groceries)

{'Tomato': 5, 'Potato': 3, 'Chilli': 10, 'Onion': 7, 'Cabbage': 1, 'apple': 2}


In [None]:
# count number of items in the dictionary
print(len(groceries))

In [72]:
# looping through dictionary items
for key, value in groceries.items():
    print(key, value)
    
for val in groceries.values():
    print(val)
    
for key in groceries.keys():
    print(key)

Tomato 5
Potato 3
Chilli 10
Onion 7
Cabbage 1
apple 2
5
3
10
7
1
2
Tomato
Potato
Chilli
Onion
Cabbage
apple


In [73]:
# retrieve view objects of items, keys and values (wrap in list() to get lists)
print(groceries.items())
print(groceries.keys())
print(groceries.values())

dict_items([('Tomato', 5), ('Potato', 3), ('Chilli', 10), ('Onion', 7), ('Cabbage', 1), ('apple', 2)])
dict_keys(['Tomato', 'Potato', 'Chilli', 'Onion', 'Cabbage', 'apple'])
dict_values([5, 3, 10, 7, 1, 2])


### If-else

In [74]:
number = input('Enter a number')
type(number)

Enter a number123


str

In [None]:
# if else 
age = int(input('Enter your age:'))
print(type(age))

if age >= 18 and age<=60:
    print("You're eligible to vote.")
elif age>60:
    print("Senior citizens not allowed.")
else:
    print('You are not eligible to vote.')

### Control Statements in Python
1. Break statement 
2. Continue statement 
3. Pass statement

***Break***

The break statement terminates the loop that contains it and transfers control to the first statement after that loop. It works in both while and for loops. In nested loops (a loop within a loop), break ends only the innermost loop that contains it, and execution continues in the outer loop.
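
The nested-loop behaviour can be seen in a small sketch. The loop bounds and the `visited` list are chosen only for illustration.

```python
visited = []
for row in range(3):              # outer loop
    for col in range(3):          # inner loop
        if col == 2:
            break                 # leaves only the inner loop
        visited.append((row, col))

print(visited)
```

The outer loop still runs for every `row`; only the pairs with `col == 2` are never reached.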

In [76]:
# break

while True:
    age = input()           # input() returns a string
    if int(age) >= 18:      # convert to int before comparing
        break               # leave the loop once the age qualifies
    else:
        print("You’re not eligible to vote")

12
You’re not eligible to vote
17
You’re not eligible to vote
11
You’re not eligible to vote
13
You’re not eligible to vote
18


***Continue***

When a program encounters a continue statement, it skips the rest of the current iteration and moves on to the next iteration of the loop. Unlike break, it does not end the loop.

In [77]:
# continue

for letter in 'Flexi ple': 
    if letter == ' ': 
        continue 
    print ('Letters: ', letter)

Letters:  F
Letters:  l
Letters:  e
Letters:  x
Letters:  i
Letters:  p
Letters:  l
Letters:  e


***Pass***

The pass statement is a null operation: nothing happens when it executes. It is used as a placeholder where a statement is syntactically required but no action is needed, such as an empty function body. Unlike break and continue, it neither terminates the loop nor skips the rest of the iteration; execution simply continues with the next statement.

In [79]:
def function(a, b):
    pass

In [None]:
for letter in 'Flexiple': 
    if letter == 'x': 
        pass 
    print ('Letters: ', letter)

***Types of Arguments:***
1. Required arguments
2. Variable-length arguments
3. Keyword arguments
4. Default arguments

In [80]:
# Required arguments
def say_hello(name):
    return 'Hello ' + name

print(say_hello('John'))
print(say_hello('Mary'))
print(say_hello())  # omitting the required argument raises TypeError

Hello John
Hello Mary
TypeError: say_hello() missing 1 required positional argument: 'name'

In [81]:
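# Variable-length arguments
def say_hello(*names):
    for i in names:
        print('Hello '+i)

# say_hello() only prints and returns None, so the outer print() shows None
print(say_hello('John', 'Mary'))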

Hello John
Hello Mary
None


In [None]:
# Keyword arguments
def hello(**name):
    print("Hello", name['fname'], name['lname'])

hello(fname="Anne", lname="Sullivan")
hello(lname="Pichai", fname="Sundar")
hello(fname="Narendra", mname="Damodar", lname="Modi")

In [82]:
# Default arguments
def hello(name="John"):
    print("Hello", name)
    
hello()
hello("Mary")

Hello John
Hello Mary


### Functional Programming
Functional programming is a programming paradigm that expresses computation as the evaluation and composition of functions.

To learn more about the elements of functional programming, see [this article](https://towardsdatascience.com/elements-of-functional-programming-in-python-1b295ea5bbe0).

***Higher-Order Functions***

In functional programming, higher-order functions are the primary tool for defining computation. A higher-order function takes a function as a parameter, returns a function as its result, or both. map(), filter(), and reduce() are three of Python’s most useful higher-order functions. Paired with simple functions, they can express complex operations.

1. Map
2. Filter
3. Reduce
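
Before turning to the built-ins, here is a minimal sketch of both kinds of higher-order function. `apply_twice` and `make_multiplier` are hypothetical names chosen for illustration.

```python
def apply_twice(func, value):
    """Takes a function as a parameter and calls it two times."""
    return func(func(value))

def make_multiplier(factor):
    """Returns a new function that multiplies its argument by factor."""
    def multiply(x):
        return x * factor
    return multiply

triple = make_multiplier(3)        # triple is itself a function
print(triple(5))                   # 15
print(apply_twice(triple, 2))      # 18, because (2 * 3) * 3 = 18
```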


### 📍 map()

SYNTAX: map(function, iterables)

In [83]:
# generate list of squares of original elements
def function(a):
    return a*a

x = map(function, (1,2,3,4))  #x is the map object

print(x)
print(list(x))

<map object at 0x7fd684473d60>
[1, 4, 9, 16]


In [84]:
# lambda with map
tup= (5, 7, 22, 97, 54, 62, 77, 23, 73, 61)
newtuple = tuple(map(lambda x: x+3 , tup)) 
print(newtuple)

(8, 10, 25, 100, 57, 65, 80, 26, 76, 64)


### 📍 filter()

SYNTAX: filter(function, iterable)

In [85]:
def func(x):
    # predicate: keep x only when it is at least 3
    return x >= 3
    
y = filter(func, (1,2,3,4))  
print(y)
print(list(y))

<filter object at 0x7fd684473a00>
[3, 4]


In [86]:
# lambda with filter
y = filter(lambda x: (x%2==0), (1,2,3,4))
print(list(y))

[2, 4]


In [87]:
# intersection of 2 lists
a = [1,2,3,5,7,9]
b = [2,3,5,6,7,8]
print(list(filter(lambda x: x in a, b)))  # prints out [2, 3, 5, 7]

[2, 3, 5, 7]


### 📍 reduce()

SYNTAX: reduce(function, iterable[, initializer])

In [89]:
from functools import reduce
reduce(lambda a,b: a+b,[23,21,45,98])

187

In [91]:
def my_add(a, b):
    result = a + b
    return result

numbers = [0, 1, 2, 3, 4]
reduce(my_add, numbers, 100)

110
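
The result 110 can be traced by writing the fold as an explicit loop. The sketch below shows what `reduce()` computes when an initial value is supplied. `fold` is a hypothetical helper written for illustration, not part of `functools`.

```python
from functools import reduce

def fold(function, iterable, initial):
    """Hypothetical re-implementation of reduce() with an initial value."""
    accumulator = initial
    for item in iterable:
        # combine the running result with the next item
        accumulator = function(accumulator, item)
    return accumulator

numbers = [0, 1, 2, 3, 4]
manual = fold(lambda a, b: a + b, numbers, 100)      # 100 + 0 + 1 + 2 + 3 + 4
builtin = reduce(lambda a, b: a + b, numbers, 100)
print(manual, builtin)
```

Both calls start from 100 and add each element in turn, so they agree.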

### Problems


### Question 1
Write a Python program to count the number of strings where the string length is 2 or more and the first and last character are same from a given list of strings.

Sample List : ['abc', 'xyz', 'aba', '1221']

Expected Result : 2

In [93]:
def count(list_of_strings):
    '''
    This function counts the number of strings satisfying 2 conditions.
    argument: takes a list of strings
    returns: integer
    '''
    count = 0
    for i in list_of_strings:
        if len(i) > 1 and i[0]==i[-1]:
            count += 1
            
    return count

count(['abc', 'xyz', 'aba', '1221'])


2

### Question 2

Write a function that counts vowels and consonants in a word 

word = physics

output:

vowels: 1

consonants: 6

In [95]:
def count (strings):
    v=0
    c=0
    for i in strings:
        if (i.lower() in "aeiou"):
            v=v+1
        else:
            c=c+1
    return v,c

count('physics')

(1, 6)

### Question 3
Write a program to generate the result below, using the map() function

sentence = 'Data Science Academy offers the best data analysis courses in Brazil'

output:

['DATA', 'data', 4]

['SCIENCE', 'science', 7]

['ACADEMY', 'academy', 7]

['OFFERS', 'offers', 6]

['THE', 'the', 3]

['BEST', 'best', 4]

['DATA', 'data', 4]

['ANALYSIS', 'analysis', 8]

['COURSES', 'courses', 7]

['IN', 'in', 2]

['BRAZIL', 'brazil', 6]

In [98]:
sentence = 'Data Science Academy offers the best data analysis courses in Brazil'
sentence.split()

['Data',
 'Science',
 'Academy',
 'offers',
 'the',
 'best',
 'data',
 'analysis',
 'courses',
 'in',
 'Brazil']

In [100]:
sentence = 'Data Science Academy offers the best data analysis courses in Brazil'

x = map(lambda word: [word.upper(), word.lower(), len(word)], sentence.split())

for i in list(x):
    print(i)


['DATA', 'data', 4]
['SCIENCE', 'science', 7]
['ACADEMY', 'academy', 7]
['OFFERS', 'offers', 6]
['THE', 'the', 3]
['BEST', 'best', 4]
['DATA', 'data', 4]
['ANALYSIS', 'analysis', 8]
['COURSES', 'courses', 7]
['IN', 'in', 2]
['BRAZIL', 'brazil', 6]


## <u>Pandas and Numpy</u>

### 👉 Data Structures in Pandas
1. Dataframe, which is two-dimensional
2. Series, which is one-dimensional


####  🌱 using lists
    - We can create a dataframe using lists
    - We pass the list as an argument to the pandas.DataFrame() function, which returns a dataframe
    - Pandas automatically assigns numerical row labels to each row of the dataframe
    - By default, pandas also assigns numerical column labels to each column if none are specified
    
    
####  🌱 using dictionary
    - We can also pass a dictionary to the pandas.DataFrame() function to create a dataframe
    - Each key of the dictionary should have a list of one or more values associated with it
    - The keys of the dictionary become column labels
    - Pandas automatically assigns numerical row labels to each row of the dataframe
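
The dictionary route can be sketched with the same fruit data, rearranged so that each key holds one column. The names `data` and `fruit_df` are chosen for illustration.

```python
import pandas as pd

# Hypothetical data: each key becomes a column label
data = {
    'Fruits': ['apple', 'banana', 'grapes'],
    'Color': ['red', 'yellow', 'green'],
}
fruit_df = pd.DataFrame(data)

print(fruit_df)
print(list(fruit_df.columns))   # column labels come from the dictionary keys
print(list(fruit_df.index))     # row labels are assigned automatically
```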

In [102]:
myList = [['apple', 'red'], ['banana', 'yellow'], ['grapes', 'green']]
myList

[['apple', 'red'], ['banana', 'yellow'], ['grapes', 'green']]

In [None]:
import pandas as pd

In [105]:
df = pd.DataFrame(myList, columns=['Fruits', 'Color'])

In [106]:
df

Unnamed: 0,Fruits,Color
0,apple,red
1,banana,yellow
2,grapes,green


### 👉 Loading csv file as a Dataframe
    - We can also load a csv file as a dataframe in pandas using pandas.read_csv() function
    - Each value of the first row of the csv file becomes the column label
    - Pandas automatically assigns numerical row labels to each row of the dataframe
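
A quick way to try `pandas.read_csv()` without a file on disk is to wrap CSV text in `io.StringIO`. The CSV content below is made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk
csv_text = "product,price\nsoap,162\nbottle,180\n"
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df)
print(list(sample_df.columns))  # labels taken from the first row of the CSV
print(list(sample_df.index))    # automatic numerical row labels
```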

In [21]:
import pandas as pd

In [158]:
# loading excel file
df = pd.read_excel('https://github.com/pranalibose/Tutorials/blob/main/Pandas/orders_data.xlsx?raw=true')
df

Unnamed: 0,order_no,order_date,buyer,ship_city,ship_state,sku,description,quantity,item_total,shipping_fee,cod,order_status
0,405-9763961-5211537,"Sun, 18 Jul, 2021, 10:38 pm IST",Mr.,"CHANDIGARH,",CHANDIGARH,SKU: 2X-3C0F-KNJE,100% Leather Elephant Shaped Piggy Coin Bank |...,1,₹449.00,,,Delivered to buyer
1,404-3964908-7850720,"Tue, 19 Oct, 2021, 6:05 pm IST",Minam,"PASIGHAT,",ARUNACHAL PRADESH,SKU: DN-0WDX-VYOT,Women's Set of 5 Multicolor Pure Leather Singl...,1,₹449.00,₹60.18,,Delivered to buyer
2,171-8103182-4289117,"Sun, 28 Nov, 2021, 10:20 pm IST",yatipertin,"PASIGHAT,",ARUNACHAL PRADESH,SKU: DN-0WDX-VYOT,Women's Set of 5 Multicolor Pure Leather Singl...,1,₹449.00,₹60.18,,Delivered to buyer
3,405-3171677-9557154,"Wed, 28 Jul, 2021, 4:06 am IST",aciya,"DEVARAKONDA,",TELANGANA,SKU: AH-J3AO-R7DN,Pure 100% Leather Block Print Rectangular Jewe...,1,,,Cash On Delivery,Delivered to buyer
4,402-8910771-1215552,"Tue, 28 Sept, 2021, 2:50 pm IST",Susmita,"MUMBAI,",MAHARASHTRA,SKU: KL-7WAA-Z82I,Pure Leather Sling Bag with Multiple Pockets a...,1,"₹1,099.00",₹84.96,,Delivered to buyer
...,...,...,...,...,...,...,...,...,...,...,...,...
166,171-2829978-1258758,"Mon, 13 Dec, 2021, 11:30 am IST",Shahin,"MUMBAI,",MAHARASHTRA,SKU: DN-0WDX-VYOT,Women's Set of 5 Multicolor Pure Leather Singl...,3,"₹1,347.00",₹84.96,Cash On Delivery,Delivered to buyer
167,402-3045457-5360311,"Wed, 1 Dec, 2021, 12:18 pm IST",Sharmistha,"DEHRADUN,",UTTARAKHAND,SKU: SB-WDQN-SDN9,Traditional Block-Printed Women's 100% Pure Le...,1,"₹1,299.00",₹114.46,,Delivered to buyer
168,408-2260162-8323567,"Thu, 9 Dec, 2021, 6:55 pm IST",shashank,"Durg,",CHHATTISGARH,SKU: SB-WDQN-SDN9,Traditional Block-Printed Women's 100% Pure Le...,1,"₹1,299.00",₹105.02,,Delivered to buyer
169,403-5664951-8941100,"Wed, 23 Feb, 2022, 12:43 am IST",Jayeta,"KOLKATA,",WEST BENGAL,SKU: N8-YFZF-P74I,Stylish and Sleek Multiple Pockets 100 Percent...,1,"₹1,499.00",₹80.24,Cash On Delivery,Delivered to buyer


### 👉 Examining dataframe

In [116]:
# head, tail, shape, info, describe
df.describe()

Unnamed: 0,quantity
count,171.0
mean,1.087719
std,0.445132
min,1.0
25%,1.0
50%,1.0
75%,1.0
max,4.0


In [120]:
# value_counts, unique
df['ship_state'].unique()

array(['CHANDIGARH', 'ARUNACHAL PRADESH', 'TELANGANA', 'MAHARASHTRA',
       'WEST BENGAL', 'UTTAR PRADESH', 'KARNATAKA', 'CHHATTISGARH',
       'HARYANA', 'TRIPURA', 'TAMIL NADU', 'ODISHA', 'ANDHRA PRADESH',
       'DELHI', 'GOA', 'Odisha', 'JAMMU & KASHMIR', 'GUJARAT', 'ASSAM',
       'KERALA', 'Maharashtra', 'PUNJAB', 'RAJASTHAN', 'CHANDIGARH,',
       'BIHAR', 'MADHYA Pradesh', 'MOHALI,', 'Andhra Pradesh',
       'Himachal Pradesh', 'UTTARAKHAND'], dtype=object)

In [127]:
# loading csv file
df = pd.read_csv('https://github.com/pranalibose/Tutorials/raw/main/Pandas/BB.csv')
df

Unnamed: 0,index,product,category,sub_category,brand,sale_price,market_price,type,rating,description
0,1,Garlic Oil - Vegetarian Capsule 500 mg,Beauty & Hygiene,Hair Care,Sri Sri Ayurveda,220,220,Hair Oil & Serum,4.1,This Product contains Garlic Oil that is known...
1,2,Water Bottle - Orange,"Kitchen, Garden & Pets",Storage & Accessories,Mastercook,180,180,Water & Fridge Bottles,2.3,"Each product is microwave safe (without lid), ..."
2,3,"Brass Angle Deep - Plain, No.2",Cleaning & Household,Pooja Needs,Trm,119,250,Lamp & Lamp Oil,3.4,"A perfect gift for all occasions, be it your m..."
3,4,Cereal Flip Lid Container/Storage Jar - Assort...,Cleaning & Household,Bins & Bathroom Ware,Nakoda,149,176,"Laundry, Storage Baskets",3.7,Multipurpose container with an attractive desi...
4,5,Creme Soft Soap - For Hands & Body,Beauty & Hygiene,Bath & Hand Wash,Nivea,162,162,Bathing Bars & Soaps,4.4,Nivea Creme Soft Soap gives your skin the best...
5,6,Germ - Removal Multipurpose Wipes,Cleaning & Household,All Purpose Cleaners,Nature Protect,169,199,Disinfectant Spray & Cleaners,3.3,Stay protected from contamination with Multipu...
6,7,Multani Mati,Beauty & Hygiene,Skin Care,Satinance,58,58,Face Care,3.6,Satinance multani matti is an excellent skin t...
7,8,Hand Sanitizer - 70% Alcohol Base,Beauty & Hygiene,Bath & Hand Wash,Bionova,250,250,Hand Wash & Sanitizers,4.0,70%Alcohol based is gentle of hand leaves skin...
8,9,Biotin & Collagen Volumizing Hair Shampoo + Bi...,Beauty & Hygiene,Hair Care,StBotanica,1098,1098,Shampoo & Conditioner,3.5,"An exclusive blend with Vitamin B7 Biotin, Hyd..."
9,10,"Scrub Pad - Anti- Bacterial, Regular",Cleaning & Household,"Mops, Brushes & Scrubs",Scotch brite,20,20,"Utensil Scrub-Pad, Glove",4.3,Scotch Brite Anti- Bacterial Scrub Pad thoroug...


In [130]:
# set_index, inplace
df.set_index('index', inplace=True)

In [131]:
df

Unnamed: 0_level_0,product,category,sub_category,brand,sale_price,market_price,type,rating,description
index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
1,Garlic Oil - Vegetarian Capsule 500 mg,Beauty & Hygiene,Hair Care,Sri Sri Ayurveda,220,220,Hair Oil & Serum,4.1,This Product contains Garlic Oil that is known...
2,Water Bottle - Orange,"Kitchen, Garden & Pets",Storage & Accessories,Mastercook,180,180,Water & Fridge Bottles,2.3,"Each product is microwave safe (without lid), ..."
3,"Brass Angle Deep - Plain, No.2",Cleaning & Household,Pooja Needs,Trm,119,250,Lamp & Lamp Oil,3.4,"A perfect gift for all occasions, be it your m..."
4,Cereal Flip Lid Container/Storage Jar - Assort...,Cleaning & Household,Bins & Bathroom Ware,Nakoda,149,176,"Laundry, Storage Baskets",3.7,Multipurpose container with an attractive desi...
5,Creme Soft Soap - For Hands & Body,Beauty & Hygiene,Bath & Hand Wash,Nivea,162,162,Bathing Bars & Soaps,4.4,Nivea Creme Soft Soap gives your skin the best...
6,Germ - Removal Multipurpose Wipes,Cleaning & Household,All Purpose Cleaners,Nature Protect,169,199,Disinfectant Spray & Cleaners,3.3,Stay protected from contamination with Multipu...
7,Multani Mati,Beauty & Hygiene,Skin Care,Satinance,58,58,Face Care,3.6,Satinance multani matti is an excellent skin t...
8,Hand Sanitizer - 70% Alcohol Base,Beauty & Hygiene,Bath & Hand Wash,Bionova,250,250,Hand Wash & Sanitizers,4.0,70%Alcohol based is gentle of hand leaves skin...
9,Biotin & Collagen Volumizing Hair Shampoo + Bi...,Beauty & Hygiene,Hair Care,StBotanica,1098,1098,Shampoo & Conditioner,3.5,"An exclusive blend with Vitamin B7 Biotin, Hyd..."
10,"Scrub Pad - Anti- Bacterial, Regular",Cleaning & Household,"Mops, Brushes & Scrubs",Scotch brite,20,20,"Utensil Scrub-Pad, Glove",4.3,Scotch Brite Anti- Bacterial Scrub Pad thoroug...


In [134]:
# slicing
df[['category', 'sub_category']][:5]

Unnamed: 0_level_0,category,sub_category
index,Unnamed: 1_level_1,Unnamed: 2_level_1
1,Beauty & Hygiene,Hair Care
2,"Kitchen, Garden & Pets",Storage & Accessories
3,Cleaning & Household,Pooja Needs
4,Cleaning & Household,Bins & Bathroom Ware
5,Beauty & Hygiene,Bath & Hand Wash


In [136]:
# filtering
df[(df['category']=='Beauty & Hygiene') & (df['sale_price']>200)]

Unnamed: 0_level_0,product,category,sub_category,brand,sale_price,market_price,type,rating,description
index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
1,Garlic Oil - Vegetarian Capsule 500 mg,Beauty & Hygiene,Hair Care,Sri Sri Ayurveda,220,220,Hair Oil & Serum,4.1,This Product contains Garlic Oil that is known...
8,Hand Sanitizer - 70% Alcohol Base,Beauty & Hygiene,Bath & Hand Wash,Bionova,250,250,Hand Wash & Sanitizers,4.0,70%Alcohol based is gentle of hand leaves skin...
9,Biotin & Collagen Volumizing Hair Shampoo + Bi...,Beauty & Hygiene,Hair Care,StBotanica,1098,1098,Shampoo & Conditioner,3.5,"An exclusive blend with Vitamin B7 Biotin, Hyd..."


In [138]:
# sorting
df.sort_values(by='sale_price', ascending=False)

Unnamed: 0_level_0,product,category,sub_category,brand,sale_price,market_price,type,rating,description
index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
9,Biotin & Collagen Volumizing Hair Shampoo + Bi...,Beauty & Hygiene,Hair Care,StBotanica,1098,1098,Shampoo & Conditioner,3.5,"An exclusive blend with Vitamin B7 Biotin, Hyd..."
8,Hand Sanitizer - 70% Alcohol Base,Beauty & Hygiene,Bath & Hand Wash,Bionova,250,250,Hand Wash & Sanitizers,4.0,70%Alcohol based is gentle of hand leaves skin...
1,Garlic Oil - Vegetarian Capsule 500 mg,Beauty & Hygiene,Hair Care,Sri Sri Ayurveda,220,220,Hair Oil & Serum,4.1,This Product contains Garlic Oil that is known...
2,Water Bottle - Orange,"Kitchen, Garden & Pets",Storage & Accessories,Mastercook,180,180,Water & Fridge Bottles,2.3,"Each product is microwave safe (without lid), ..."
6,Germ - Removal Multipurpose Wipes,Cleaning & Household,All Purpose Cleaners,Nature Protect,169,199,Disinfectant Spray & Cleaners,3.3,Stay protected from contamination with Multipu...
5,Creme Soft Soap - For Hands & Body,Beauty & Hygiene,Bath & Hand Wash,Nivea,162,162,Bathing Bars & Soaps,4.4,Nivea Creme Soft Soap gives your skin the best...
4,Cereal Flip Lid Container/Storage Jar - Assort...,Cleaning & Household,Bins & Bathroom Ware,Nakoda,149,176,"Laundry, Storage Baskets",3.7,Multipurpose container with an attractive desi...
3,"Brass Angle Deep - Plain, No.2",Cleaning & Household,Pooja Needs,Trm,119,250,Lamp & Lamp Oil,3.4,"A perfect gift for all occasions, be it your m..."
7,Multani Mati,Beauty & Hygiene,Skin Care,Satinance,58,58,Face Care,3.6,Satinance multani matti is an excellent skin t...
10,"Scrub Pad - Anti- Bacterial, Regular",Cleaning & Household,"Mops, Brushes & Scrubs",Scotch brite,20,20,"Utensil Scrub-Pad, Glove",4.3,Scotch Brite Anti- Bacterial Scrub Pad thoroug...


In [139]:
# groupby

df = pd.DataFrame({'Gender': ['female', 'male', 'female', 'male', 'male', 'female'], 'Score': [45, 88, 95, 40, 60, 35]})
df

Unnamed: 0,Gender,Score
0,female,45
1,male,88
2,female,95
3,male,40
4,male,60
5,female,35


In [146]:
# get the sum of scores for each gender
df.groupby('Gender').sum()

Unnamed: 0_level_0,Score
Gender,Unnamed: 1_level_1
female,175
male,188


In [124]:
import numpy as np

In [125]:
# the string 'sum' selects the same aggregation as np.sum
pd.pivot_table(df, index='Gender', values='Score', aggfunc='sum')


Unnamed: 0_level_0,Score
Gender,Unnamed: 1_level_1
female,175
male,188


In [154]:
# merging dataframes

data1 = {
  "name": ["Sally", "Mary", "John"],
  "age": [50, 40, 30]
}

data2 = {
  "name": ["Sally", "Peter", "Micky"],
  "age": [50, 44, 22]
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

df1

Unnamed: 0,name,age
0,Sally,50
1,Mary,40
2,John,30


In [155]:
df2

Unnamed: 0,name,age
0,Sally,50
1,Peter,44
2,Micky,22


In [157]:
df1.merge(df2, how='right')

Unnamed: 0,name,age
0,Sally,50
1,Peter,44
2,Micky,22


### Problems

### ❓ Task 1
1. Check the datatype of 'quantity' column
2. Change the datatype of quantity column

In [159]:
# df here is the orders dataframe loaded earlier with pd.read_excel()
df['quantity'].dtype

dtype('int64')

In [161]:
df['quantity'] = df['quantity'].astype('int8')
df['quantity'].dtype

dtype('int8')

In [183]:
df.head()

Unnamed: 0,order_no,order_date,buyer,ship_city,ship_state,sku,description,quantity,item_total,shipping_fee,cod,order_status
0,405-9763961-5211537,"Sun, 18 Jul, 2021, 10:38 pm IST",Mr.,"CHANDIGARH,",CHANDIGARH,SKU: 2X-3C0F-KNJE,100% Leather Elephant Shaped Piggy Coin Bank |...,1,₹449.00,,,Delivered to buyer
1,404-3964908-7850720,"Tue, 19 Oct, 2021, 6:05 pm IST",Minam,"PASIGHAT,",ARUNACHAL PRADESH,SKU: DN-0WDX-VYOT,Women's Set of 5 Multicolor Pure Leather Singl...,1,₹449.00,₹60.18,,Delivered to buyer
2,171-8103182-4289117,"Sun, 28 Nov, 2021, 10:20 pm IST",yatipertin,"PASIGHAT,",ARUNACHAL PRADESH,SKU: DN-0WDX-VYOT,Women's Set of 5 Multicolor Pure Leather Singl...,1,₹449.00,₹60.18,,Delivered to buyer
3,405-3171677-9557154,"Wed, 28 Jul, 2021, 4:06 am IST",aciya,"DEVARAKONDA,",TELANGANA,SKU: AH-J3AO-R7DN,Pure 100% Leather Block Print Rectangular Jewe...,1,,,Cash On Delivery,Delivered to buyer
4,402-8910771-1215552,"Tue, 28 Sept, 2021, 2:50 pm IST",Susmita,"MUMBAI,",MAHARASHTRA,SKU: KL-7WAA-Z82I,Pure Leather Sling Bag with Multiple Pockets a...,1,"₹1,099.00",₹84.96,,Delivered to buyer


### ❓ Task 2
1. Display the number of missing values in each column
2. Slice the dataset to fetch the first 5 rows and the last 3 columns
3. Get all the unique values in ship_state column
4. Display the number of unique values in order_status column
5. Select orders with ship_state 'Maharashtra' and cod with missing value
6. Get the total quantity for orders of each state
7. Fetch the maximum quantity ordered by each state
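
Tasks 1, 3 and 4 rely on `isnull().sum()`, `unique()` and `nunique()`. Here is a minimal sketch on a small made-up dataframe. The `orders` frame and its values are hypothetical, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the orders data
orders = pd.DataFrame({
    'ship_state': ['DELHI', 'GOA', 'DELHI', 'KERALA'],
    'cod': ['Cash On Delivery', None, None, None],
    'order_status': ['Delivered to buyer', 'Delivered to buyer',
                     'Returned to seller', 'Delivered to buyer'],
})

missing_per_column = orders.isnull().sum()       # Task 1: missing values per column
states = orders['ship_state'].unique()           # Task 3: unique values, in order of appearance
n_statuses = orders['order_status'].nunique()    # Task 4: number of unique values

print(missing_per_column)
print(states)
print(n_statuses)
```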

In [185]:
# 7. maximum quantity ordered by each state
df['quantity'].groupby(df['ship_state']).max()

ship_state
ANDHRA PRADESH       1
ARUNACHAL PRADESH    1
ASSAM                1
Andhra Pradesh       1
BIHAR                1
CHANDIGARH           1
CHANDIGARH,          1
CHHATTISGARH         1
DELHI                4
GOA                  1
GUJARAT              1
HARYANA              3
Himachal Pradesh     1
JAMMU & KASHMIR      1
KARNATAKA            1
KERALA               1
MADHYA Pradesh       1
MAHARASHTRA          4
MOHALI,              1
Maharashtra          1
ODISHA               1
Odisha               1
PUNJAB               1
RAJASTHAN            1
TAMIL NADU           3
TELANGANA            1
TRIPURA              1
UTTAR PRADESH        1
UTTARAKHAND          1
WEST BENGAL          3
Name: quantity, dtype: int8

In [184]:
# 6. total quantity for orders of each state
df['quantity'].groupby(df['ship_state']).sum()

ship_state
ANDHRA PRADESH        3
ARUNACHAL PRADESH     2
ASSAM                 4
Andhra Pradesh        1
BIHAR                 1
CHANDIGARH            3
CHANDIGARH,           1
CHHATTISGARH          6
DELHI                11
GOA                   2
GUJARAT               6
HARYANA              10
Himachal Pradesh      1
JAMMU & KASHMIR       1
KARNATAKA            16
KERALA                4
MADHYA Pradesh        1
MAHARASHTRA          37
MOHALI,               1
Maharashtra           1
ODISHA                2
Odisha                1
PUNJAB                1
RAJASTHAN             3
TAMIL NADU           19
TELANGANA            13
TRIPURA               1
UTTAR PRADESH        12
UTTARAKHAND           1
WEST BENGAL          21
Name: quantity, dtype: int8

In [182]:
# 5. orders with ship_state 'Maharashtra' (exact, case-sensitive match) and missing cod
df[(df['ship_state']=='Maharashtra') & (df['cod'].isnull())]

Unnamed: 0,order_no,order_date,buyer,ship_city,ship_state,sku,description,quantity,item_total,shipping_fee,cod,order_status
61,402-6806027-8773139,"Mon, 29 Nov, 2021, 10:34 am IST",Rana,"Pune,",Maharashtra,SKU: 0M-RFE6-443C,Set of 2 Pure Leather Block Print Round Jewelr...,1,₹399.00,₹84.96,,Delivered to buyer


In [181]:
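# 4. value_counts() lists each order_status with its count;
#    df['order_status'].nunique() returns the number of unique values directly
df['order_status'].value_counts()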

Delivered to buyer    160
Returned to seller     11
Name: order_status, dtype: int64

In [178]:
# 2. first 5 rows and last 3 columns, label-based (loc includes the end label 4)
df.loc[:4, 'shipping_fee':]

Unnamed: 0,shipping_fee,cod,order_status
0,,,Delivered to buyer
1,₹60.18,,Delivered to buyer
2,₹60.18,,Delivered to buyer
3,,Cash On Delivery,Delivered to buyer
4,₹84.96,,Delivered to buyer


In [179]:
# 2. the same slice, position-based (iloc excludes the end position 5)
df.iloc[:5, -3:]

Unnamed: 0,shipping_fee,cod,order_status
0,,,Delivered to buyer
1,₹60.18,,Delivered to buyer
2,₹60.18,,Delivered to buyer
3,,Cash On Delivery,Delivered to buyer
4,₹84.96,,Delivered to buyer


<u>References</u>

https://developers.google.com/edu/python/lists

https://flexiple.com/python/control-statements-in-python/