<h1>Python 2 - Object Oriented Programming and Pandas</h1>


<p>4 Pillars of OOP</p>
<ul>
    
<li>Encapsulation: Group related variables and functions together to reduce complexity and increase reusability</li>
<li>Data Abstraction: Creating methods to interface with attributes of your class. Show only essentials to reduce complexity</li>
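<li>Inheritance: Build new classes on top of existing ones so that shared attributes and methods are written only once</li>
<li>Polymorphism: Redefine (override) inherited methods so that each child class responds to the same method call in its own way</li>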

</ul>
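<p>As a rough illustration of the first two pillars, the sketch below bundles a value with the methods that manage it (encapsulation) and exposes only a small public interface (data abstraction). The Odometer class and its names are hypothetical and are not part of the classes defined later in this notebook.</p>

```python
class Odometer:
    """Hypothetical helper class used only for this illustration."""

    def __init__(self):
        # The leading underscore marks the attribute as internal by convention.
        self._miles = 0

    def add_miles(self, miles):
        """Public interface: validate the input before changing the data."""
        if miles < 0:
            raise ValueError("miles must not be negative")
        self._miles += miles

    @property
    def miles(self):
        """Read-only access to the current reading."""
        return self._miles


odometer = Odometer()
odometer.add_miles(120)
print(odometer.miles)  # 120
```

<p>Callers only need add_miles() and miles; how the reading is stored stays an internal detail of the class.</p>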

<h1>Inheritance</h1>
<ul>
    <li>New classes do not need to be declared from scratch. They may build on existing classes</li>
    <li>When one class inherits from another, it automatically takes on all the attributes and methods of the first class</li>
    <li>Goal: Eliminate redundant code by inheriting attributes and methods from a parent class</li>
</ul>


In [35]:
class Car():    
    """A simple attempt to represent a car."""
    def __init__(self, make, model, year):        
        self.make = make        
        self.model = model        
        self.year = year        
        self.odometer_reading = 0            
    
    def get_descriptive_name(self):        
        long_name = str(self.year) + ' ' + self.make + ' ' + self.model        
        return long_name.title()        
    
    def read_odometer(self):        
        print("This car has " + str(self.odometer_reading) + " miles on it.")            
        
    def update_odometer(self, mileage):        
        if mileage >= self.odometer_reading:            
            self.odometer_reading = mileage        
        else:            
            print("You can't roll back an odometer!")        
            
    def increment_odometer(self, miles):        
        self.odometer_reading += miles

        

In [36]:
class ElectricCar(Car):    
    """Represent aspects of a car, specific to electric vehicles."""
    def __init__(self, make, model, year):      
        """Initialize attributes of the parent class."""  
        super().__init__(make, model, year)        
        
        

In [37]:
my_tesla = ElectricCar('tesla', 'model s', 2016) 
print(my_tesla.get_descriptive_name())



2016 Tesla Model S


In [38]:
my_tesla.increment_odometer(10)
my_tesla.read_odometer()



This car has 10 miles on it.
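<p>A child class can also add attributes and methods of its own on top of what it inherits. The sketch below uses a trimmed-down Car so that it runs on its own; the battery_size attribute, its default value, and describe_battery() are assumptions made for illustration.</p>

```python
class Car:
    """Trimmed-down version of the Car class above."""

    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = year

    def get_descriptive_name(self):
        return f"{self.year} {self.make} {self.model}".title()


class ElectricCar(Car):
    """Child class that adds an attribute and a method of its own."""

    def __init__(self, make, model, year, battery_size=75):
        super().__init__(make, model, year)  # reuse the parent's setup
        self.battery_size = battery_size     # hypothetical extra attribute (kWh)

    def describe_battery(self):
        return f"This car has a {self.battery_size}-kWh battery."


my_tesla = ElectricCar("tesla", "model s", 2016)
print(my_tesla.get_descriptive_name())  # inherited: 2016 Tesla Model S
print(my_tesla.describe_battery())      # This car has a 75-kWh battery.
print(isinstance(my_tesla, Car))        # True
```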


<h1>Polymorphism</h1>

<ul>
    <li>Because child classes inherit all attributes and methods from their parent class, we may wish to refactor and customize classes to specific use cases.</li>
    <li>Overriding means redefining an inherited method in a child class so that it better suits that class</li>
</ul>

In [39]:
class GasCar(Car):
    def __init__(self, make, model, year):      
        """Initialize attributes of the parent class."""
        super().__init__(make, model, year) 
        
    def get_descriptive_name(self):        
        long_name = str(self.year) + ' ' + self.make + ' '\
                    + self.model + " is a gas car"     
        return long_name.title() 
    
    

In [40]:
my_bmw = GasCar('BMW', 'i8', 2015)
print(my_bmw.get_descriptive_name())



2015 Bmw I8 Is A Gas Car
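<p>The practical benefit of overriding is that calling code can treat every object the same way and still get class-specific behaviour. A minimal sketch, using simplified stand-ins for the classes above (the " (Gas)" suffix is an assumption chosen for brevity):</p>

```python
class Car:
    def __init__(self, make, model, year):
        self.make, self.model, self.year = make, model, year

    def get_descriptive_name(self):
        return f"{self.year} {self.make} {self.model}".title()


class GasCar(Car):
    def get_descriptive_name(self):
        # Extend the parent's result instead of rebuilding the string.
        return super().get_descriptive_name() + " (Gas)"


class ElectricCar(Car):
    pass  # inherits get_descriptive_name unchanged


garage = [GasCar("bmw", "i8", 2015), ElectricCar("tesla", "model s", 2016)]

# The same call works on every object; each class supplies its own version.
names = [car.get_descriptive_name() for car in garage]
for name in names:
    print(name)
# 2015 Bmw I8 (Gas)
# 2016 Tesla Model S
```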


<h1>Pandas</h1>

In [41]:
import pandas as pd
%matplotlib inline



<h1>Reading CSV Files</h1>

<ul>
    <li>Function to use in Pandas: read_csv()</li>
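    <li>The value passed to read_csv() is usually a string containing the <b>exact</b> name or path of the file</li>
    <li>A bare file name is looked up in the current working directory, so the CSV file must sit next to the python file/notebook unless a relative or absolute path is given</li>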
</ul>
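<p>The sketch below shows read_csv() loading a file from a subdirectory by path. To stay self-contained it first writes a tiny CSV into a temporary directory; the file name and its contents are made up for the example.</p>

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Create data/sample.csv inside a throwaway directory.
    csv_path = Path(tmp) / "data" / "sample.csv"
    csv_path.parent.mkdir()
    csv_path.write_text("year,export_val\n1995,67177.77\n1996,514674.15\n")

    # read_csv accepts a path string or a pathlib.Path.
    sample = pd.read_csv(csv_path)

print(sample.shape)          # (2, 2)
print(list(sample.columns))  # ['year', 'export_val']
```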

In [42]:
df = pd.read_csv("imports - Sheet1.csv")
#read_excel also an option

#print(df)



<h1>Basic DataFrame Functions</h1>

<ul>
    <li>head() will display the first 5 rows of the DataFrame</li>
    <li>tail() will display the last 5 rows of the DataFrame</li>
    <li>shape will display the dimensions of the DataFrame as (rows, columns)</li>
    <li>columns is an attribute, not a method: it returns the column labels as an Index, and assigning a list to it renames the columns</li>
    <li>dtypes will display the type of each column of the DataFrame</li>
    <li>drop() will return a copy of the DataFrame with the given rows or columns removed</li>
</ul>
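<p>The functions listed above can be tried on a small hand-built DataFrame before running them on the real file. The values below are a made-up sample, not the contents of the CSV used in this notebook.</p>

```python
import pandas as pd

sample = pd.DataFrame({
    "year": [1995, 1995, 1996],
    "country destination": ["BFA", "CAF", "CIV"],
    "export_val": [67177.77, 514674.15, 58011.71],
})

print(sample.head(2))        # head/tail accept the number of rows to show
print(sample.shape)          # (3, 3)
print(list(sample.columns))  # ['year', 'country destination', 'export_val']
print(sample.dtypes)

# drop returns a new DataFrame; the original keeps all of its columns.
without_val = sample.drop(columns=["export_val"])
without_first_row = sample.drop(index=[0])
print(list(without_val.columns))  # ['year', 'country destination']
print(len(without_first_row))     # 2
```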

In [43]:
df.head()



Unnamed: 0,year,country_origin_id,country_destination_id,hs92_product_id,export_val,export_val_pct
0,1995,VNM,BFA,ALL,67177.77,0.00%
1,1995,VNM,CAF,ALL,514674.15,0.00%
2,1995,VNM,CIV,ALL,58011.71,0.00%
3,1995,VNM,CMR,ALL,97669.0,0.00%
4,1995,VNM,COG,ALL,24018.39,0.00%


In [44]:
df.tail()



Unnamed: 0,year,country_origin_id,country_destination_id,hs92_product_id,export_val,export_val_pct
2425,2015,VNM,ECU,ALL,4412351.39,0.01%
2426,2015,VNM,GUY,ALL,7137466.15,0.02%
2427,2015,VNM,PER,ALL,280650.31,0.00%
2428,2015,VNM,PRY,ALL,16496727.35,0.05%
2429,2015,VNM,URY,ALL,206349.39,0.00%


In [45]:
df.shape



(2430, 6)

In [46]:
df.columns



Index(['year', 'country_origin_id', 'country_destination_id',
       'hs92_product_id', 'export_val', 'export_val_pct'],
      dtype='object')

In [47]:
df.columns = ["year", "country origin", "country destination", 
              "product", "export_val", "export_val_pct"]

df.head()



Unnamed: 0,year,country origin,country destination,product,export_val,export_val_pct
0,1995,VNM,BFA,ALL,67177.77,0.00%
1,1995,VNM,CAF,ALL,514674.15,0.00%
2,1995,VNM,CIV,ALL,58011.71,0.00%
3,1995,VNM,CMR,ALL,97669.0,0.00%
4,1995,VNM,COG,ALL,24018.39,0.00%


In [48]:
df.dtypes



year                     int64
country origin          object
country destination     object
product                 object
export_val             float64
export_val_pct          object
dtype: object

<h1>Indexing and Series Functions</h1>

<ul>
    <li>Columns of a DataFrame can be accessed through the following format: df_name["name_of_column"] </li>
    <li>A single column is returned as a Series, which has different methods than a DataFrame</li>
    <li>A couple useful Series functions: max(), median(), min(), value_counts(), sort_values()</li>
</ul>
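<p>A short sketch of the Series functions named above, run on small made-up values so the results are easy to check by hand:</p>

```python
import pandas as pd

exports = pd.Series([10.0, 40.0, 20.0, 30.0], name="export_val")
years = pd.Series([1995, 1995, 1996], name="year")

print(exports.max())     # 40.0
print(exports.min())     # 10.0
print(exports.median())  # 25.0 (mean of the two middle values)

# value_counts tallies how often each distinct value occurs.
counts = years.value_counts()
print(counts[1995])      # 2

# sort_values returns a new Series; the original order is kept in exports.
ordered = exports.sort_values()
print(list(ordered))     # [10.0, 20.0, 30.0, 40.0]
```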

In [49]:
df["export_val"]

df["export_val"].max()



2718394688.0

In [50]:
df["export_val"].median()



767979.0700000001

In [51]:
df["export_val"].min()



1000.0

In [52]:
df["year"].value_counts()



2007    131
2005    131
2006    129
2008    124
2003    124
2004    124
2009    123
2010    121
2000    120
2002    120
2001    119
2011    116
1999    114
2012    114
2015    109
2014    109
2013    108
1998    108
1997    101
1996     98
1995     87
Name: year, dtype: int64

In [53]:
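# sort_values returns a new, sorted DataFrame; df itself is left unchanged
# unless the result is assigned back (df = df.sort_values(...))
df.sort_values(by = "year", ascending = True)
df.head()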



Unnamed: 0,year,country origin,country destination,product,export_val,export_val_pct
0,1995,VNM,BFA,ALL,67177.77,0.00%
1,1995,VNM,CAF,ALL,514674.15,0.00%
2,1995,VNM,CIV,ALL,58011.71,0.00%
3,1995,VNM,CMR,ALL,97669.0,0.00%
4,1995,VNM,COG,ALL,24018.39,0.00%
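<p>Because sort_values() does not modify the DataFrame it is called on, the sorted result has to be stored to be used later. A minimal sketch with made-up values:</p>

```python
import pandas as pd

sample = pd.DataFrame({"year": [1997, 1995, 1996],
                       "export_val": [3.0, 1.0, 2.0]})

# Keep the sorted copy in its own variable.
newest_first = sample.sort_values(by="year", ascending=False)

print(list(sample["year"]))        # [1997, 1995, 1996] (unchanged)
print(list(newest_first["year"]))  # [1997, 1996, 1995]

# Rows keep their original index labels after sorting ...
print(list(newest_first.index))    # [0, 2, 1]

# ... unless the index is rebuilt.
renumbered = newest_first.reset_index(drop=True)
print(list(renumbered.index))      # [0, 1, 2]
```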


In [54]:
# delete one column (axis=1 means columns; newer pandas versions require it as a keyword)
df.drop("export_val_pct", axis=1).head()



Unnamed: 0,year,country origin,country destination,product,export_val
0,1995,VNM,BFA,ALL,67177.77
1,1995,VNM,CAF,ALL,514674.15
2,1995,VNM,CIV,ALL,58011.71
3,1995,VNM,CMR,ALL,97669.0
4,1995,VNM,COG,ALL,24018.39


In [55]:
# delete multiple columns; inplace=True modifies df directly instead of returning a copy
df.drop(["export_val_pct", "product"], axis=1, inplace=True)



In [56]:
df.head()



Unnamed: 0,year,country origin,country destination,export_val
0,1995,VNM,BFA,67177.77
1,1995,VNM,CAF,514674.15
2,1995,VNM,CIV,58011.71
3,1995,VNM,CMR,97669.0
4,1995,VNM,COG,24018.39


<h1>Indexing</h1>

<ul>
    <li>Square brackets on a DataFrame (df["name_of_column"]) select columns by label, so selecting rows by their integer position requires the iloc indexer, which takes square brackets rather than parentheses.
    </li>
    <li>
      Allowed inputs are:
        <ul>
            <li>An integer, e.g. 5.</li>
            <li>A list or array of integers, e.g. [4, 3, 0].</li>
            <li>A slice object with ints, e.g. 1:7.</li>
        </ul>
    </li>
</ul>
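<p>iloc always works by position, while its counterpart loc works by index label. The difference is easiest to see when the labels are not 0, 1, 2, as in this small made-up DataFrame:</p>

```python
import pandas as pd

sample = pd.DataFrame(
    {"year": [1995, 1996, 1997], "export_val": [1.0, 2.0, 3.0]},
    index=[10, 20, 30],  # labels chosen to differ from positions
)

first_by_position = sample.iloc[0]  # row at position 0
first_by_label = sample.loc[10]     # row labelled 10 (the same row)
print(first_by_position["year"] == first_by_label["year"])  # True

# Position slices exclude the end; label slices include it.
print(list(sample.iloc[0:2].index))   # [10, 20]
print(list(sample.loc[10:30].index))  # [10, 20, 30]

# Negative positions count from the end, as with lists.
print(sample.iloc[-1, 1])  # 3.0
```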

In [57]:
#Retrieve a couple rows from their index values
df.iloc[[0]]
df.iloc[[0, 1]]



Unnamed: 0,year,country origin,country destination,export_val
0,1995,VNM,BFA,67177.77
1,1995,VNM,CAF,514674.15


In [58]:
#Similar to arrays, we can use slicing to access multiple rows
df.iloc[:5]



Unnamed: 0,year,country origin,country destination,export_val
0,1995,VNM,BFA,67177.77
1,1995,VNM,CAF,514674.15
2,1995,VNM,CIV,58011.71
3,1995,VNM,CMR,97669.0
4,1995,VNM,COG,24018.39


In [59]:
#We may also provide specific row/column values to access specific values
df.iloc[0, 1]



'VNM'

In [60]:
#Multiple rows and specific columns
df.iloc[[0, 2], [1, 3]]



Unnamed: 0,country origin,export_val
0,VNM,67177.77
2,VNM,58011.71


In [61]:
#We can also slice multiple rows / columns
df.iloc[1:3, 0:3]



Unnamed: 0,year,country origin,country destination
1,1995,VNM,CAF
2,1995,VNM,CIV


In [62]:
#How to iterate over rows
for index, row in df.iterrows():
    print(f'Export from {row["country origin"]} to {row["country destination"]} of {row["export_val"]}')    
          
          

Export from VNM to BFA of 67177.77
Export from VNM to CAF of 514674.15
Export from VNM to CIV of 58011.71
Export from VNM to CMR of 97669.0
Export from VNM to COG of 24018.39
...
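<p>iterrows() builds a Series for every row, which can be slow on large DataFrames. One commonly used alternative is itertuples(), sketched below on made-up data. It exposes columns as attributes, so the sketch assumes column names without spaces (origin, destination), unlike the names used in this notebook.</p>

```python
import pandas as pd

sample = pd.DataFrame({
    "origin": ["VNM", "VNM"],
    "destination": ["BFA", "CAF"],
    "export_val": [67177.77, 514674.15],
})

# Each row arrives as a lightweight namedtuple instead of a Series.
lines = [
    f"Export from {row.origin} to {row.destination} of {row.export_val}"
    for row in sample.itertuples(index=False)
]
for line in lines:
    print(line)
# Export from VNM to BFA of 67177.77
# Export from VNM to CAF of 514674.15
```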

<h1>Conditional Indexing</h1>

<ul>
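    <li>Comparison operators (>, ==, >=) applied to a column produce a boolean Series, which can be used to return only the rows where the condition is True</li>
    <li>Bitwise operators (|, &) can be used to combine conditions; wrap each condition in parentheses, because | and & bind more tightly than comparisons</li>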
</ul>
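<p>It can help to look at the boolean Series on its own before using it as a filter. The sketch below does that on made-up data and also shows two related tools, isin() and the ~ (not) operator, which are additions not covered in the list above.</p>

```python
import pandas as pd

sample = pd.DataFrame({
    "year": [1995, 1996, 2000, 2001],
    "country destination": ["CAF", "CHN", "CAF", "CHN"],
    "export_val": [10.0, 20.0, 30.0, 40.0],
})

# A comparison yields one True/False value per row.
mask = sample["year"] > 1999
print(list(mask))  # [False, False, True, True]

# Combine conditions with & (and) or | (or); parentheses are required.
recent_chn = sample[(sample["year"] > 1999)
                    & (sample["country destination"] == "CHN")]
print(list(recent_chn["export_val"]))  # [40.0]

# isin() replaces a chain of == comparisons joined by |.
mid_90s = sample[sample["year"].isin([1995, 1996])]
print(list(mid_90s["year"]))  # [1995, 1996]

# ~ inverts a condition.
not_caf = sample[~(sample["country destination"] == "CAF")]
print(list(not_caf["country destination"]))  # ['CHN', 'CHN']
```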

In [63]:
df_1995 = df[df["year"] == 1995]

df_1995.head()



Unnamed: 0,year,country origin,country destination,export_val
0,1995,VNM,BFA,67177.77
1,1995,VNM,CAF,514674.15
2,1995,VNM,CIV,58011.71
3,1995,VNM,CMR,97669.0
4,1995,VNM,COG,24018.39


In [64]:
df_2000s = df[df["year"] > 1999]

df_2000s.head()



Unnamed: 0,year,country origin,country destination,export_val
508,2000,VNM,BEN,923912.58
509,2000,VNM,BFA,339732.75
510,2000,VNM,CAF,33662.13
511,2000,VNM,CIV,342503.71
512,2000,VNM,CMR,1447.32


In [65]:
caf_1995 = df[(df["year"]  == 1995) & (df["country destination"]  == "CAF")]
caf_1995.head()



Unnamed: 0,year,country origin,country destination,export_val
1,1995,VNM,CAF,514674.15


In [66]:
df[(df["year"]  == 1995) | (df["year"]  == 1996)].head()



Unnamed: 0,year,country origin,country destination,export_val
0,1995,VNM,BFA,67177.77
1,1995,VNM,CAF,514674.15
2,1995,VNM,CIV,58011.71
3,1995,VNM,CMR,97669.0
4,1995,VNM,COG,24018.39


In [67]:
# find the exports to CAN in 1995

# find the exports to CAN for years greater than 1999

# example: every export to CHN, regardless of year
df[df["country destination"] == "CHN"].head()

Unnamed: 0,year,country origin,country destination,export_val
23,1995,VNM,CHN,58936550.0
112,1996,VNM,CHN,61753460.0
212,1997,VNM,CHN,108174900.0
315,1998,VNM,CHN,49851200.0
422,1999,VNM,CHN,38343320.0


<h1>Formatting Data</h1>

<ul>
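    <li>To format the string values of a column, use the methods available through the "str" accessor of a Series, e.g. df["name_of_column"].str.replace()</li>
    <li>Float display can be controlled by assigning a formatting function to pd.options.display.float_format; this changes only how values are printed, not the stored data</li>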
</ul>
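<p>A small sketch of both ideas on made-up data. It chains several str methods and uses pd.option_context so that the float format applies only inside the with-block instead of for the whole session.</p>

```python
import pandas as pd

sample = pd.DataFrame({
    "country origin": [" vnm", "VNM ", "vnm"],
    "export_val": [97669.0, 67177.77, 514674.154],
})

# .str methods run on every element of a text column at once.
sample["country origin"] = (
    sample["country origin"]
    .str.strip()                                  # remove stray spaces
    .str.upper()                                  # normalise the case
    .str.replace("VNM", "Vietnam", regex=False)   # literal replacement
)
print(list(sample["country origin"]))  # ['Vietnam', 'Vietnam', 'Vietnam']

# The display option is active only inside the with-block.
with pd.option_context("display.float_format", "{:.2f}".format):
    shown = sample.to_string()
print(shown)

# The stored numbers keep their full precision.
print(sample["export_val"].iloc[2])  # 514674.154
```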

In [68]:
df["country origin"] = df["country origin"].str.replace("VNM", "Vietnam")



In [69]:
df.head()



Unnamed: 0,year,country origin,country destination,export_val
0,1995,Vietnam,BFA,67177.77
1,1995,Vietnam,CAF,514674.15
2,1995,Vietnam,CIV,58011.71
3,1995,Vietnam,CMR,97669.0
4,1995,Vietnam,COG,24018.39


In [70]:
pd.options.display.float_format = "{:.2f}".format
df.head()



Unnamed: 0,year,country origin,country destination,export_val
0,1995,Vietnam,BFA,67177.77
1,1995,Vietnam,CAF,514674.15
2,1995,Vietnam,CIV,58011.71
3,1995,Vietnam,CMR,97669.0
4,1995,Vietnam,COG,24018.39


In [71]:
df.to_csv("exports.csv")
#to_excel also an option
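<p>By default to_csv() also writes the row index as an unnamed first column, which shows up as "Unnamed: 0" when the file is read back. Passing index=False leaves it out. The sketch below uses made-up data and in-memory text instead of a file so that it runs anywhere.</p>

```python
import io

import pandas as pd

sample = pd.DataFrame({"year": [1995, 1996], "export_val": [1.5, 2.5]})

# Without a path, to_csv returns the CSV text as a string.
with_index = sample.to_csv()
without_index = sample.to_csv(index=False)

print(with_index.splitlines()[0])     # ,year,export_val
print(without_index.splitlines()[0])  # year,export_val

# Reading the first version back produces an extra column.
round_trip = pd.read_csv(io.StringIO(with_index))
print(list(round_trip.columns))  # ['Unnamed: 0', 'year', 'export_val']
```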

