In [1]:
# (22/09/2022) Thursday

# Let's Explore:

<img src='https://www.bing.com/th?id=ABTC03DDE83A354D3B1E5177A0F20DB8A20E902DC1CBB3047DF7639CDF4B2204D0B&w=608&h=200&c=2&rs=1&o=6&pid=SANGAM'>

**Last Dollar Road, Colorado, USA:** 


There’s a chance this dazzling view of autumn foliage is not where you think it is. While New England garners most of the attention in the United States this time of year, the high country of Colorado is spectacular, too. This thicket of aspen trees is on the Last Dollar Road, a scenic drive in southwest Colorado that is as dramatic as its name suggests, with stunning views of peaks and meadows, and of course the aspen trees that colour the landscape.

Aspens thrive in the cold winters and cool summers of Colorado, where they grow at altitudes between 1,500 and 3,500 metres, typically reaching heights of 15 metres. Aspens generally grow on west-facing slopes, basking in the afternoon sun. They’re among the world’s largest living organisms because aspen groves share a single root system. They’re also the state’s only native deciduous tree and cover about a fifth of its forested land.

There are many places to see them, but none better than the Last Dollar Road. It’s unique among scenic byways in the United States. For one, it’s not a highway at all, but a dirt road. The road is never plowed, so when the first snow falls, the road closes and stays closed until spring arrives. That makes this the last time of year you can take this drive, and it happens to be the most beautiful. Just make sure your vehicle is capable of the journey!

In [2]:
# Class Starts from Here:

### LOC and ILOC:

<img src='https://miro.medium.com/max/1400/1*h0mnGqz4mDWK1hT-4cccYQ.png' width=75%>

#### Concept of LOC and ILOC:

<img src='https://i0.wp.com/sparkbyexamples.com/wp-content/uploads/2021/10/pandas-difference-loc-vs-iloc.png?resize=840%2C353&ssl=1' width=75%>

<img src='https://miro.medium.com/max/1400/1*OIG9kZYmgVsS_r3niQl_LA.png' width=75%>

<img src='https://www.datasciencemadesimple.com/wp-content/uploads/2017/11/iloc-and-loc-syntax.png' width=75%>

In [3]:
import pandas as pd

import numpy as np

In [4]:
data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four']
                   )

In [5]:
data

Unnamed: 0,one,two,three,four
Ohio,0,1,2,3
Colorado,4,5,6,7
Utah,8,9,10,11
New York,12,13,14,15


In [6]:
data.loc['Utah']

one       8
two       9
three    10
four     11
Name: Utah, dtype: int32

In [7]:
data.iloc[2]

one       8
two       9
three    10
four     11
Name: Utah, dtype: int32

In [8]:
data.loc['Ohio':'Utah']

Unnamed: 0,one,two,three,four
Ohio,0,1,2,3
Colorado,4,5,6,7
Utah,8,9,10,11


In [9]:
data.iloc[0:3]

Unnamed: 0,one,two,three,four
Ohio,0,1,2,3
Colorado,4,5,6,7
Utah,8,9,10,11


In [10]:
data.loc['Ohio':'Colorado','one':'three']

Unnamed: 0,one,two,three
Ohio,0,1,2
Colorado,4,5,6


In [11]:
data.iloc[0:2,0:3]

Unnamed: 0,one,two,three
Ohio,0,1,2
Colorado,4,5,6


In [12]:
data.loc['Ohio','two']

1

In [13]:
data.iloc[0,1]

1
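The paired cells above return the same rows, but the two indexers slice differently: `loc` works on labels and includes the end label, while `iloc` works on integer positions and excludes the end position, like ordinary Python slicing. A minimal sketch that rebuilds the same `data` frame so it runs on its own (`by_label` and `by_position` are illustrative names, not part of the class code):

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# loc: label-based, the end label 'Utah' is included
by_label = data.loc['Ohio':'Utah']

# iloc: position-based, the end position 3 is excluded
by_position = data.iloc[0:3]

# Both selections contain Ohio, Colorado and Utah
print(by_label.equals(by_position))
```

This is why `'Ohio':'Utah'` and `0:3` pick the same three rows even though Utah sits at position 2.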

<img src='https://miro.medium.com/max/1334/1*w-hu_MU94ioIeRkJfezE4A.png' width=75%>

In [14]:
data.iloc[:, :3][data.three > 6]

Unnamed: 0,one,two,three
Utah,8,9,10
New York,12,13,14
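The same filter can probably be written more directly as a single `loc` call, passing a boolean mask for the rows and a label slice for the columns, instead of chaining two indexing steps. A sketch on the same frame, rebuilt here so the cell is self-contained (`mask` and `result` are illustrative names):

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Boolean Series: True for rows where column 'three' is greater than 6
mask = data['three'] > 6

# Rows chosen by the mask, columns chosen by an inclusive label slice
result = data.loc[mask, 'one':'three']
print(result)
```

A single `loc` call also tends to be the safer form when the selection is later used for assignment.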


In [15]:
# Arithmetic methods with fill values:
df1 = pd.DataFrame(np.arange(12.).reshape((3, 4)), columns=list('abcd'))
df2 = pd.DataFrame(np.arange(20).reshape((4,5)), columns=list('abcde'))

In [16]:
df1

Unnamed: 0,a,b,c,d
0,0.0,1.0,2.0,3.0
1,4.0,5.0,6.0,7.0
2,8.0,9.0,10.0,11.0


In [17]:
df2

Unnamed: 0,a,b,c,d,e
0,0,1,2,3,4
1,5,6,7,8,9
2,10,11,12,13,14
3,15,16,17,18,19


In [18]:
df1 + df2

Unnamed: 0,a,b,c,d,e
0,0.0,2.0,4.0,6.0,
1,9.0,11.0,13.0,15.0,
2,18.0,20.0,22.0,24.0,
3,,,,,


In [19]:
df3 = df1.add(df2, fill_value=0)
df3

Unnamed: 0,a,b,c,d,e
0,0.0,2.0,4.0,6.0,4.0
1,9.0,11.0,13.0,15.0,9.0
2,18.0,20.0,22.0,24.0,14.0
3,15.0,16.0,17.0,18.0,19.0
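Note that `fill_value` only stands in for a value that is missing on one side. If a cell is absent from both frames, the result is still `NaN`. That case does not occur with `df1` and `df2`, so here is a small sketch with two hypothetical frames, `left` and `right`, made up for illustration:

```python
import numpy as np
import pandas as pd

# 'b' exists only in left, 'c' only in right; row 0 only in left, row 2 only in right
left = pd.DataFrame({'a': [1.0, 2.0], 'b': [10.0, 20.0]}, index=[0, 1])
right = pd.DataFrame({'a': [100.0, 200.0], 'c': [5.0, 6.0]}, index=[1, 2])

total = left.add(right, fill_value=0)
print(total)

# (row 2, column 'b') and (row 0, column 'c') exist in neither frame,
# so they stay NaN even with fill_value=0
print(total.isna())
```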


In [20]:
df3[df3['b'] < 15]

Unnamed: 0,a,b,c,d,e
0,0.0,2.0,4.0,6.0,4.0
1,9.0,11.0,13.0,15.0,9.0


In [21]:
df3[df3['b'] > 15]

Unnamed: 0,a,b,c,d,e
2,18.0,20.0,22.0,24.0,14.0
3,15.0,16.0,17.0,18.0,19.0


In [22]:
df3[df3['b'] > 15][['c','d']]

Unnamed: 0,c,d
2,22.0,24.0
3,17.0,18.0


# Lambda Function:

A lambda is an anonymous (unnamed) function. Its body is limited to a single expression, so it suits simple functions, not complex ones.

<img src='https://miro.medium.com/max/709/1*pF-YBYsPCNFk3yK9hVcEtg.png' width=75%>

<img src='https://www.softwaretestinghelp.com/wp-content/qa/uploads/2021/02/fig1_lambda-expression.jpg' width=75%>

<img src='https://i.pinimg.com/originals/b8/0e/19/b80e19765f360202766df527088deb06.gif'>

<a href='https://www.programiz.com/python-programming/anonymous-function'> Learn More about Lambda Function!</a>
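To make the idea concrete, here is a minimal sketch comparing a regular `def` function with the equivalent lambda. The names `times_ten` and `times_ten_lambda` are invented for this example:

```python
# A named function defined with def
def times_ten(x):
    return x * 10

# The same logic as a lambda: one expression, no def, no return keyword
times_ten_lambda = lambda x: x * 10

print(times_ten(3), times_ten_lambda(3))
```

In practice a lambda is usually passed straight to another function, such as `apply`, `map` or `filter`, rather than stored in a variable.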

In [23]:
data.four = data.four.apply(lambda x:x*10)

In [24]:
data

Unnamed: 0,one,two,three,four
Ohio,0,1,2,30
Colorado,4,5,6,70
Utah,8,9,10,110
New York,12,13,14,150


In [25]:
data['Status'] =  data.four.apply(lambda x: 'Big' if x>100 else 'Small')

In [26]:
data

Unnamed: 0,one,two,three,four,Status
Ohio,0,1,2,30,Small
Colorado,4,5,6,70,Small
Utah,8,9,10,110,Big
New York,12,13,14,150,Big
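For a simple two-way condition like this one, `np.where` is a common vectorized alternative to `apply` with a lambda. The sketch below uses a small stand-in frame named `demo` (an assumption, chosen so the class's `data` frame is left untouched) holding the same `four` values:

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({'four': [30, 70, 110, 150]},
                    index=['Ohio', 'Colorado', 'Utah', 'New York'])

# np.where(condition, value_if_true, value_if_false) works on the whole column at once
demo['Status'] = np.where(demo['four'] > 100, 'Big', 'Small')
print(demo)
```

On large frames this generally runs faster than calling a Python function once per row, though for four rows the difference is irrelevant.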


<img src='https://appdividend.com/wp-content/uploads/2020/02/How-to-Use-If-Else-and-Elif-in-Lambda-Functions-in-Python.png' width=75%>

<img src='https://miro.medium.com/max/1400/1*lqk_tWTr13ncxgYfetdexg.png' width=75%>

<img src='https://cdn.techbeamers.com/wp-content/uploads/2018/09/Python-Lambda-Anonymous-nameless-Function.png' width=75%>

In [27]:
def test(x):
    # Label a single number: 'Big' if it is above 100, otherwise 'Small'
    if x > 100:
        return 'Big'
    else:
        return 'Small'

In [28]:
arr = [12,32,323, 45, 678, 789, 100]
map(test, arr)

<map at 0x15c85c35f10>

In [29]:
list(map(test, arr))

['Small', 'Small', 'Big', 'Small', 'Big', 'Big', 'Small']

In [30]:
my_list = [1, 5, 4, 6, 8, 11, 3, 12]

new_list = list(filter(lambda x: (x%2 == 0) , my_list))

print(new_list)

[4, 6, 8, 12]
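`map` and `filter` return lazy iterators, which is why `map(test, arr)` printed a `<map ...>` object until it was wrapped in `list`. List comprehensions give the same results as a list directly. A sketch, where `label_size` is a hypothetical helper that mirrors the `test` function above:

```python
def label_size(x):
    # Same rule as the class example: above 100 is 'Big'
    return 'Big' if x > 100 else 'Small'

arr = [12, 32, 323, 45, 678, 789, 100]
my_list = [1, 5, 4, 6, 8, 11, 3, 12]

# Equivalent of list(map(label_size, arr))
labels = [label_size(x) for x in arr]

# Equivalent of list(filter(lambda x: x % 2 == 0, my_list))
evens = [x for x in my_list if x % 2 == 0]

print(labels)
print(evens)
```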


In [31]:
data['mapbylambdafunction'] = list(map(lambda x: 'Big' if x>100 else 'Small', data['four']))

In [32]:
data

Unnamed: 0,one,two,three,four,Status,mapbylambdafunction
Ohio,0,1,2,30,Small,Small
Colorado,4,5,6,70,Small,Small
Utah,8,9,10,110,Big,Big
New York,12,13,14,150,Big,Big


<a href='https://grow.google/intl/en_pk/certificates/?utm_source=google&utm_medium=hpp&utm_campaign=Google%20Career%20Certifications%20(GCC)%20Launch%20Pakistan'> Google Certificate! Just Do it ! </a>

In [33]:
employee = pd.DataFrame({
    'id':[123,124,125,126,127,128,129],
    'name':["Kamal","Hassan","Umer",'Ali',"Taqi","Zain","Noah"],
    'age':[23,np.nan,34,54,34,21,45],
    "salary":[np.nan,30,56,76,89,45,90],
    "Depart":["Prod","Supply","Accounts", "Accounts",'Prod',"Prod","DataScience"],
    "Service":[5,6,7,np.nan,2,1,9]})

In [34]:
employee

Unnamed: 0,id,name,age,salary,Depart,Service
0,123,Kamal,23.0,,Prod,5.0
1,124,Hassan,,30.0,Supply,6.0
2,125,Umer,34.0,56.0,Accounts,7.0
3,126,Ali,54.0,76.0,Accounts,
4,127,Taqi,34.0,89.0,Prod,2.0
5,128,Zain,21.0,45.0,Prod,1.0
6,129,Noah,45.0,90.0,DataScience,9.0


In [35]:
employee.columns

Index(['id', 'name', 'age', 'salary', 'Depart', 'Service'], dtype='object')

In [36]:
employee.info  # without parentheses this shows the bound method instead of calling it

<bound method DataFrame.info of     id    name   age  salary       Depart  Service
0  123   Kamal  23.0     NaN         Prod      5.0
1  124  Hassan   NaN    30.0       Supply      6.0
2  125    Umer  34.0    56.0     Accounts      7.0
3  126     Ali  54.0    76.0     Accounts      NaN
4  127    Taqi  34.0    89.0         Prod      2.0
5  128    Zain  21.0    45.0         Prod      1.0
6  129    Noah  45.0    90.0  DataScience      9.0>

In [37]:
employee.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 6 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   id       7 non-null      int64  
 1   name     7 non-null      object 
 2   age      6 non-null      float64
 3   salary   6 non-null      float64
 4   Depart   7 non-null      object 
 5   Service  6 non-null      float64
dtypes: float64(3), int64(1), object(2)
memory usage: 464.0+ bytes
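The non-null counts reported by `info()` show that `age`, `salary` and `Service` each have one missing value. One way to get those counts directly is `isna().sum()`. A sketch on a copy of the same data, stored under the hypothetical name `emp_demo` so that `employee` is not overwritten:

```python
import numpy as np
import pandas as pd

emp_demo = pd.DataFrame({
    'id': [123, 124, 125, 126, 127, 128, 129],
    'name': ["Kamal", "Hassan", "Umer", "Ali", "Taqi", "Zain", "Noah"],
    'age': [23, np.nan, 34, 54, 34, 21, 45],
    'salary': [np.nan, 30, 56, 76, 89, 45, 90],
    'Depart': ["Prod", "Supply", "Accounts", "Accounts", "Prod", "Prod", "DataScience"],
    'Service': [5, 6, 7, np.nan, 2, 1, 9]})

# isna() marks each missing cell as True; sum() counts the True values per column
missing = emp_demo.isna().sum()
print(missing)
```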


In [38]:
employee.shape

(7, 6)

In [39]:
employee.describe() # By default, describe() summarizes only the numeric columns, not the text columns

Unnamed: 0,id,age,salary,Service
count,7.0,6.0,6.0,6.0
mean,126.0,35.166667,64.333333,5.0
std,2.160247,12.67149,24.598103,3.03315
min,123.0,21.0,30.0,1.0
25%,124.5,25.75,47.75,2.75
50%,126.0,34.0,66.0,5.5
75%,127.5,42.25,85.75,6.75
max,129.0,54.0,90.0,9.0
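Text columns can be summarized as well by passing `include='all'`, which adds `unique`, `top` and `freq` rows for them. A sketch on a cut-down copy of the employee data (the frame name `emp_text` is an assumption made for this example):

```python
import pandas as pd

emp_text = pd.DataFrame({
    'id': [123, 124, 125, 126, 127, 128, 129],
    'Depart': ["Prod", "Supply", "Accounts", "Accounts", "Prod", "Prod", "DataScience"]})

# include='all' summarizes every column; statistics that do not apply show as NaN
summary = emp_text.describe(include='all')
print(summary)

# For the text column: number of distinct values, most frequent value, and its count
print(summary.loc[['unique', 'top', 'freq'], 'Depart'])
```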


In [40]:
# EDA (Exploratory Data Analysis): exploring the values in the data

# Inferential Analysis: applying statistics to the data to draw conclusions and make predictions.

In [41]:
employee.salary.median()

66.0

In [42]:
employee.salary.mode()

0    30.0
1    45.0
2    56.0
3    76.0
4    89.0
5    90.0
dtype: float64

In [43]:
employee["salary"] = [30,30,56,76,89,45,90]

In [44]:
employee.salary.mode()

0    30
dtype: int64
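`mode()` returns a Series rather than a single number because several values can tie for most frequent. In the first call every salary appeared exactly once, so all six came back. After 30 was repeated, it became the only mode. A sketch with the same salary values held in two standalone Series (`no_repeats` and `with_repeat` are illustrative names):

```python
import pandas as pd

no_repeats = pd.Series([30, 45, 56, 76, 89, 90])
with_repeat = pd.Series([30, 30, 56, 76, 89, 45, 90])

# Every value occurs once, so all of them tie as modes
print(no_repeats.mode().tolist())

# Only 30 occurs twice, so it is the single mode
print(with_repeat.mode().tolist())

# To get one scalar, take the first entry of the result
most_common = with_repeat.mode().iloc[0]
print(most_common)
```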

In [45]:
employee2 = pd.DataFrame({
    'Employee id':[123,124,125,126,127,128,129],
    'Employee name':["Kamal","Hassan","Umer",'Ali',"Taqi","Zain","Noah"],
    'Employee age':[23,np.nan,34,54,34,21,45],
    "Employee  monthly salary":[np.nan,30,56,76,89,45,90],
    "Employee Department":["Prod","Supply","Accounts", "Accounts",'Prod',"Prod","DataScience"],
    "Employee Service as per year":[5,6,7,np.nan,2,1,9]})

In [46]:
employee2

Unnamed: 0,Employee id,Employee name,Employee age,Employee monthly salary,Employee Department,Employee Service as per year
0,123,Kamal,23.0,,Prod,5.0
1,124,Hassan,,30.0,Supply,6.0
2,125,Umer,34.0,56.0,Accounts,7.0
3,126,Ali,54.0,76.0,Accounts,
4,127,Taqi,34.0,89.0,Prod,2.0
5,128,Zain,21.0,45.0,Prod,1.0
6,129,Noah,45.0,90.0,DataScience,9.0


In [47]:
employee2.columns

Index(['Employee id', 'Employee name', 'Employee age',
       'Employee  monthly salary', 'Employee Department',
       'Employee Service as per year'],
      dtype='object')

In [48]:
employee2.columns = ['id','name','age','salary','dept','service']

In [49]:
employee2

Unnamed: 0,id,name,age,salary,dept,service
0,123,Kamal,23.0,,Prod,5.0
1,124,Hassan,,30.0,Supply,6.0
2,125,Umer,34.0,56.0,Accounts,7.0
3,126,Ali,54.0,76.0,Accounts,
4,127,Taqi,34.0,89.0,Prod,2.0
5,128,Zain,21.0,45.0,Prod,1.0
6,129,Noah,45.0,90.0,DataScience,9.0
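Assigning a full list to `columns` works only if the list has exactly one name per column, in the right order. When only some columns need new names, `rename` with a mapping is a common alternative. A sketch on a two-column stand-in frame (`frame` and `renamed` are hypothetical names):

```python
import pandas as pd

frame = pd.DataFrame({'Employee id': [123, 124],
                      'Employee name': ["Kamal", "Hassan"]})

# Map old names to new names; columns not listed keep their current name
renamed = frame.rename(columns={'Employee id': 'id', 'Employee name': 'name'})

print(renamed.columns.tolist())
print(frame.columns.tolist())  # the original frame is unchanged
```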


In [50]:
employee3 = pd.DataFrame({
    'Employee id':[np.nan,123,124,125,126,np.nan,127,128,129],
    'Employee name':["Abdullah","Kamal","Hassan","Kashif","Umer",'Ali',"Taqi","Zain","Noah"],
    'Employee age':[24,23,np.nan,34,54,34,34,21,45],
    "Employee  monthly salary":[np.nan,np.nan,30,np.nan,56,76,89,45,90],
    "Employee Department":["DataScience","Prod","Prod","Supply","Accounts", "Accounts",'Prod',"Prod","DataScience"],
    "Employee Service as per year":[np.nan,5,6,np.nan,7,np.nan,2,1,9]})


In [51]:
employee3

Unnamed: 0,Employee id,Employee name,Employee age,Employee monthly salary,Employee Department,Employee Service as per year
0,,Abdullah,24.0,,DataScience,
1,123.0,Kamal,23.0,,Prod,5.0
2,124.0,Hassan,,30.0,Prod,6.0
3,125.0,Kashif,34.0,,Supply,
4,126.0,Umer,54.0,56.0,Accounts,7.0
5,,Ali,34.0,76.0,Accounts,
6,127.0,Taqi,34.0,89.0,Prod,2.0
7,128.0,Zain,21.0,45.0,Prod,1.0
8,129.0,Noah,45.0,90.0,DataScience,9.0


In [52]:
employee3.dropna(subset=["Employee id"], axis=0, inplace=True)

In [53]:
employee3

Unnamed: 0,Employee id,Employee name,Employee age,Employee monthly salary,Employee Department,Employee Service as per year
1,123.0,Kamal,23.0,,Prod,5.0
2,124.0,Hassan,,30.0,Prod,6.0
3,125.0,Kashif,34.0,,Supply,
4,126.0,Umer,54.0,56.0,Accounts,7.0
6,127.0,Taqi,34.0,89.0,Prod,2.0
7,128.0,Zain,21.0,45.0,Prod,1.0
8,129.0,Noah,45.0,90.0,DataScience,9.0
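After the drop, the index keeps its original labels (1, 2, 3, 4, 6, 7, 8), so there are gaps where rows 0 and 5 used to be. If a clean 0-based index is wanted, one option is to chain `reset_index(drop=True)` and assign the result to a new name instead of using `inplace=True`. A sketch on a shortened stand-in frame (`staff` and `cleaned` are hypothetical names):

```python
import numpy as np
import pandas as pd

staff = pd.DataFrame({
    'Employee id': [np.nan, 123, 124, np.nan, 127],
    'Employee name': ["Abdullah", "Kamal", "Hassan", "Ali", "Taqi"]})

# Drop rows with a missing id, then renumber the remaining rows from 0
cleaned = staff.dropna(subset=['Employee id']).reset_index(drop=True)

print(cleaned)
print(len(staff))  # the original frame still has all five rows
```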


In [54]:
#Done