# V15: Loop Data Structure Part 2

## We have seen how to iterate over a 2D NumPy array...

## Let's see how iteration works on a Pandas DataFrame:

### iterrows:
1. Let's import the data from the brics.csv file and loop over the DataFrame directly.

        import pandas as pd
        brics = pd.read_csv('brics.csv', index_col = 0 )
        for val in brics:
            print(val)
          
   #### The output will be all the column names of the dataset, not its rows.
  
  
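2. To loop over the rows instead, use the iterrows() method. On every iteration it yields the row label and the row contents (as a Pandas Series):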
                    
          import pandas as pd
          brics = pd.read_csv('brics.csv', index_col = 0 )
          
          for lab, row in brics.iterrows():
              print(lab)
              print(row)
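
The difference between the two loops above can be checked without the CSV file. The following is a minimal sketch that assumes a small inline DataFrame as a stand-in for brics.csv (three rows, with values copied from the output shown later in these notes); the variable names `column_names` and `pairs` are only illustrative.

```python
import pandas as pd

# Inline stand-in for brics.csv (an assumption; the notes read the real file)
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India"],
        "capital": ["Brasilia", "Moscow", "New Delhi"],
    },
    index=["BR", "RU", "IN"],
)

# Looping over the DataFrame itself yields the column names
column_names = [val for val in brics]
print(column_names)

# iterrows() yields (row label, row as a Series) pairs
pairs = [(lab, row["capital"]) for lab, row in brics.iterrows()]
print(pairs)
```

Collecting the values in lists makes it easy to see that the first loop never touches the row data, while the second one has access to both the label and the row.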
          
### Selective column printing from DataFrame:
    import pandas as pd
    brics = pd.read_csv('brics.csv', index_col = 0 )
    
    for lab, row in brics.iterrows():
        print(lab + ':' + row['capital'] )
     
### Add Column:

    import pandas as pd
    brics = pd.read_csv('brics.csv', index_col = 0 )
    
    for lab, row in brics.iterrows():
        # iterrows() creates a new Series (row) on every iteration;
        # store the length of the country name in a new column
        brics.loc[lab, 'name_length'] = len(row['country'])

    print(brics)

In [18]:
# Iterating over Pandas DataFrame:

import pandas as pd
brics = pd.read_csv('country.csv', index_col = 0 )
     
for lab, row in brics.iterrows():
    print(lab)
    print(row, '\n\n\n')
    

BR
country         Brazil
capital       Brasilia
area             8.516
population       200.4
Name: BR, dtype: object 



RU
country       Russia
capital       Moscow
area            17.1
population     143.5
Name: RU, dtype: object 



IN
country           India
capital       New Delhi
area              3.286
population       1252.0
Name: IN, dtype: object 



CH
country         China
capital       Beijing
area            9.597
population     1357.0
Name: CH, dtype: object 



SA
country       South Africa
capital           Pretoria
area                 1.221
population           52.98
Name: SA, dtype: object 





In [17]:
# Selecting columns:

import pandas as pd
brics = pd.read_csv('country.csv', index_col = 0 )
    
for lab, row in brics.iterrows():
    
    print(lab + ':' + row['capital'])
        
        

BR:Brasilia
RU:Moscow
IN:New Delhi
CH:Beijing
SA:Pretoria


In [21]:
### Add Column:

import pandas as pd
brics = pd.read_csv('country.csv', index_col = 0 )
    
for lab, row in brics.iterrows():
    # row is a new Series created by iterrows() on every iteration
    brics.loc[lab, 'name_length'] = len(row['country'])

print(brics)

# Using apply(), we can get the same result without writing a for loop:

brics['name_length'] = brics['country'].apply(len)
print('\n\n\nUsing apply() function we can add a column to dataframe like below: \n\n', brics)

         country    capital    area  population  name_length
BR        Brazil   Brasilia   8.516      200.40          6.0
RU        Russia     Moscow  17.100      143.50          6.0
IN         India  New Delhi   3.286     1252.00          5.0
CH         China    Beijing   9.597     1357.00          5.0
SA  South Africa   Pretoria   1.221       52.98         12.0



Using apply() function we can add a column to dataframe like below: 

          country    capital    area  population  name_length
BR        Brazil   Brasilia   8.516      200.40            6
RU        Russia     Moscow  17.100      143.50            6
IN         India  New Delhi   3.286     1252.00            5
CH         China    Beijing   9.597     1357.00            5
SA  South Africa   Pretoria   1.221       52.98           12
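
The two printouts above differ in one detail: the loop version shows name_length as 6.0, while the apply() version shows 6. A likely explanation is that the loop creates the column on its first assignment, when the other rows are still missing, so pandas falls back to a floating-point column; apply() computes all values at once and can keep integers. The sketch below, which assumes an inline DataFrame instead of the CSV file, prints both dtypes so the behaviour can be checked on the installed pandas version.

```python
import pandas as pd

# Inline stand-in for the CSV file (an assumption for this sketch)
brics = pd.DataFrame(
    {"country": ["Brazil", "Russia", "India", "China", "South Africa"]},
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Loop version: the column is created by the first assignment,
# while the remaining rows have no value yet
for lab, row in brics.iterrows():
    brics.loc[lab, "name_length"] = len(row["country"])
loop_dtype = brics["name_length"].dtype
loop_values = brics["name_length"].tolist()

# apply() version: the whole column is computed in one go
brics["name_length"] = brics["country"].apply(len)
apply_dtype = brics["name_length"].dtype
apply_values = brics["name_length"].tolist()

print(loop_dtype, apply_dtype)
```

Either way the values themselves are the same; only the column type differs.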


## Example 1: Loop over DataFrame (1)
Iterating over a Pandas DataFrame is typically done with the iterrows() method. Used in a for loop, every observation is iterated over and on every iteration the row label and actual row contents are available:

    for lab, row in brics.iterrows() :
        ...
        
In this and the following exercises you will be working on the cars DataFrame. It contains information on the cars per capita and whether people drive right or left for seven countries in the world.
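
If cars.csv is not at hand, the DataFrame can be rebuilt by hand. This is a sketch that assumes the seven rows printed in the outputs of these notes are the complete data set; the real file may contain more than is shown here.

```python
import pandas as pd

# Values copied from the printed outputs in these notes
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": [
            "United States", "Australia", "Japan", "India",
            "Russia", "Morocco", "Egypt",
        ],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

print(cars.shape)
```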

### Steps: 
1. Write a for loop that iterates over the rows of cars and on each iteration performs two print() calls: one to print out the row label and one to print out all of the row's contents.

In [33]:
# Example 1: Loop over DataFrame (1)
import pandas as pd
cars = pd.read_csv('cars.csv', index_col =0)

for lab, row in cars.iterrows():
    print(lab)
    print(row)

US
cars_per_cap              809
country         United States
drives_right             True
Name: US, dtype: object
AUS
cars_per_cap          731
country         Australia
drives_right        False
Name: AUS, dtype: object
JPN
cars_per_cap      588
country         Japan
drives_right    False
Name: JPN, dtype: object
IN
cars_per_cap       18
country         India
drives_right    False
Name: IN, dtype: object
RU
cars_per_cap       200
country         Russia
drives_right      True
Name: RU, dtype: object
MOR
cars_per_cap         70
country         Morocco
drives_right       True
Name: MOR, dtype: object
EG
cars_per_cap       45
country         Egypt
drives_right     True
Name: EG, dtype: object


## Example 2: Loop over DataFrame (2)
The row data that's generated by iterrows() on every run is a Pandas Series. This format is not very convenient to print out. Luckily, you can easily select variables from the Pandas Series using square brackets:

    for lab, row in brics.iterrows() :
        print(row['country'])
        
### Steps: 
1. Using the iterators lab and row, adapt the code in the for loop such that the first iteration prints out "US: 809", the second iteration "AUS: 731", and so on.
2. The output should be in the form "country: cars_per_cap". Make sure to print out this exact string (with the correct spacing).
    * You can use str() to convert your integer data to a string so that you can print it in conjunction with the country label.
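
The hint about str() can be tried on a small scale. The sketch below assumes a two-row inline stand-in for cars; `with_str`, `with_fstring` and `lines` are illustrative names. It builds the requested "label: value" string once with str() and once with an f-string, which is an alternative not used in the exercise itself.

```python
import pandas as pd

# Two-row stand-in for the cars DataFrame (an assumption for this sketch)
cars = pd.DataFrame({"cars_per_cap": [809, 731]}, index=["US", "AUS"])

lines = []
for lab, row in cars.iterrows():
    # A string cannot be concatenated with a number directly,
    # so the number is converted with str() first
    with_str = lab + ": " + str(row["cars_per_cap"])
    # An f-string performs the conversion implicitly
    with_fstring = f"{lab}: {row['cars_per_cap']}"
    assert with_str == with_fstring
    lines.append(with_str)

print(lines)
```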
    
    

In [32]:
# Example 2: Loop over DataFrame (2)
for lab, row in cars.iterrows():
    print(lab + ': ' + str(row['cars_per_cap']))

US: 809
AUS: 731
JPN: 588
IN: 18
RU: 200
MOR: 70
EG: 45


## Example 3: Add column (1)
In the video, Hugo showed you how to add the length of the country names of the brics DataFrame in a new column:

    for lab, row in brics.iterrows() :
        brics.loc[lab, "name_length"] = len(row["country"])

You can do similar things on the cars DataFrame.

### Steps: 
1. Use a for loop to add a new column, named COUNTRY, that contains an uppercase version of the country names in the "country" column. You can use the string method upper() for this.
2. To see if your code worked, print out cars. Don't indent this code, so that it's not part of the for loop.



In [31]:
#Example 3: Add column (1)
for lab, row in cars.iterrows():
    cars.loc[lab, 'COUNTRY'] = row['country'].upper()


print(cars)

     cars_per_cap        country  drives_right        COUNTRY
US            809  United States          True  UNITED STATES
AUS           731      Australia         False      AUSTRALIA
JPN           588          Japan         False          JAPAN
IN             18          India         False          INDIA
RU            200         Russia          True         RUSSIA
MOR            70        Morocco          True        MOROCCO
EG             45          Egypt          True          EGYPT


## Example 4: Add column (2)
Using iterrows() to iterate over every observation of a Pandas DataFrame is easy to understand, but not very efficient. On every iteration, you're creating a new Pandas Series.

If you want to add a column to a DataFrame by calling a function on another column, the iterrows() method in combination with a for loop is not the preferred way to go. Instead, you'll want to use apply().

Compare the iterrows() version with the apply() version to get the same result in the brics DataFrame:

    for lab, row in brics.iterrows() :
        brics.loc[lab, "name_length"] = len(row["country"])
    
    brics["name_length"] = brics["country"].apply(len)
    
We can do a similar thing to call the upper() method on every name in the country column. However, upper() is a string method rather than a standalone function like len(), so we refer to it as str.upper when passing it to apply():

### Steps: 
1. Replace the for loop with a one-liner that uses .apply(str.upper). The call should give the same result: a column COUNTRY should be added to cars, containing an uppercase version of the country names.
2. As usual, print out cars to see the fruits of your hard labor.


In [38]:
# Example 4: Add column (2)

cars['COUNTRY'] = cars['country'].apply(str.upper)

print(cars)

     cars_per_cap        country  drives_right        COUNTRY
US            809  United States          True  UNITED STATES
AUS           731      Australia         False      AUSTRALIA
JPN           588          Japan         False          JAPAN
IN             18          India         False          INDIA
RU            200         Russia          True         RUSSIA
MOR            70        Morocco          True        MOROCCO
EG             45          Egypt          True          EGYPT
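
To confirm that the loop and the one-liner really produce the same column, the two can be run side by side. The sketch below assumes a three-row inline stand-in for cars. It also includes a third variant based on the pandas `.str` accessor, which is not part of the exercise and is shown only as another common way to get the same result.

```python
import pandas as pd

# Three-row stand-in for the cars DataFrame (an assumption for this sketch)
cars = pd.DataFrame(
    {"country": ["United States", "Australia", "Japan"]},
    index=["US", "AUS", "JPN"],
)

# 1. iterrows(): a new Series is built for every row
looped = cars.copy()
for lab, row in looped.iterrows():
    looped.loc[lab, "COUNTRY"] = row["country"].upper()

# 2. apply(): str.upper is called on each element of the column
applied = cars.copy()
applied["COUNTRY"] = applied["country"].apply(str.upper)

# 3. .str accessor (not in the original notes)
vectorized = cars.copy()
vectorized["COUNTRY"] = vectorized["country"].str.upper()

same = (
    looped["COUNTRY"].tolist()
    == applied["COUNTRY"].tolist()
    == vectorized["COUNTRY"].tolist()
)
print(same)
```

All three variants work on copies of the same DataFrame, so the comparison is not affected by one approach overwriting the column of another.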
