We can access elements in Pandas DataFrames in many different ways. In general, we can access rows, columns, or individual elements of the DataFrame by using the row and column labels. We will use the same store_items DataFrame created in the previous lesson. Let's see some examples:

In [28]:
import pandas as pd
# We create a list of Python dictionaries
items2 = [{'bikes': 20, 'pants': 30, 'watches': 35},
         {'watches': 10, 'glasses': 50, 'bikes': 15, 'pants': 5}]

# We create a DataFrame and provide the row index
store_items = pd.DataFrame(items2, index = ['store 1', 'store 2'])

print(store_items)
print('\n')

# We access rows, columns and elements using labels
print('How many bikes are in each store:\n', store_items[['bikes']])
print('\n')
print('How many bikes and pants are in each store:\n', store_items[['bikes', 'pants']])
print('\n')
print('What items are in Store 1:\n', store_items.loc[['store 1']])
print('\n')
print('How many bikes are in Store 2:', store_items['bikes']['store 2'])

         bikes  glasses  pants  watches
store 1     20      NaN     30       35
store 2     15     50.0      5       10


How many bikes are in each store:
          bikes
store 1     20
store 2     15


How many bikes and pants are in each store:
          bikes  pants
store 1     20     30
store 2     15      5


What items are in Store 1:
          bikes  glasses  pants  watches
store 1     20      NaN     30       35


How many bikes are in Store 2: 15
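
The individual element above is reached by indexing the column first and then the row. As a minimal sketch of an alternative, the .loc and .at accessors take the row label first and the column label second in a single step. The variable names bikes_store_2, pants_store_1 and store_1_row are only illustrative.

```python
import pandas as pd

# Same data as the store_items DataFrame used in this lesson
items2 = [{'bikes': 20, 'pants': 30, 'watches': 35},
          {'watches': 10, 'glasses': 50, 'bikes': 15, 'pants': 5}]
store_items = pd.DataFrame(items2, index=['store 1', 'store 2'])

# .loc takes the row label first, then the column label
bikes_store_2 = store_items.loc['store 2', 'bikes']

# .at is a scalar-only accessor with the same label order
pants_store_1 = store_items.at['store 1', 'pants']

# A single row label (not wrapped in a list) returns a Series
store_1_row = store_items.loc['store 1']

print('Bikes in store 2:', bikes_store_2)
print('Pants in store 1:', pants_store_1)
print('Type of a single row:', type(store_1_row).__name__)
```

Note that store_items.loc[['store 1']], with the label inside a list, returns a one-row DataFrame, while store_items.loc['store 1'] returns a Series.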


We can also modify our DataFrames by adding rows or columns. Let's start by learning how to add new columns to our DataFrames. Let's suppose we decided to add shirts to the items we have in stock at each store. To do this, we will need to add a new column to our store_items DataFrame indicating how many shirts are in each store. Let's do that:

In [29]:
# We add a new column named shirts to our store_items DataFrame indicating the number of shirts in stock at each store
# We will put 15 shirts in store 1 and 2 shirts in store 2
store_items['shirts'] = [15,2]

store_items

         bikes  glasses  pants  watches  shirts
store 1     20      NaN     30       35      15
store 2     15     50.0      5       10       2
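
When a column is assigned from a list, the list needs one value per row. The sketch below, on a small stand-in DataFrame, shows that case along with two related ones: a scalar is broadcast to every row, and a list of the wrong length raises a ValueError. The column names open and socks are hypothetical and used only for illustration.

```python
import pandas as pd

# Small stand-in for store_items
store_items = pd.DataFrame({'bikes': [20, 15], 'pants': [30, 5]},
                           index=['store 1', 'store 2'])

# A list must contain exactly one value per row
store_items['shirts'] = [15, 2]

# A scalar is broadcast to every row ('open' is a hypothetical column)
store_items['open'] = True

# A list of the wrong length is rejected with a ValueError
try:
    store_items['socks'] = [1, 2, 3]  # hypothetical column: 3 values, 2 rows
    length_error = None
except ValueError as err:
    length_error = err

print(store_items)
print('Wrong-length list rejected:', length_error is not None)
```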


We can also add new columns to our DataFrame by using arithmetic operations between other columns in our DataFrame. Let's see an example:

In [30]:
# We make a new column called suits by adding the number of shirts and pants
store_items['suits'] = store_items['pants'] + store_items['shirts']

print(store_items)
print('\n')

         bikes  glasses  pants  watches  shirts  suits
store 1     20      NaN     30       35      15     45
store 2     15     50.0      5       10       2      7
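
The pants and shirts columns have no missing values, so their sum is complete. If one of the operands contains NaN, as the glasses column does, the plain + operator propagates the NaN. The following sketch contrasts that with Series.add() and its fill_value argument, which treats a missing entry as the given value. The names plain_sum and filled_sum are illustrative.

```python
import pandas as pd

items2 = [{'bikes': 20, 'pants': 30, 'watches': 35},
          {'watches': 10, 'glasses': 50, 'bikes': 15, 'pants': 5}]
store_items = pd.DataFrame(items2, index=['store 1', 'store 2'])

# Plain + propagates NaN: store 1 has no glasses entry
plain_sum = store_items['glasses'] + store_items['pants']

# .add() with fill_value=0 treats the missing entry as zero
filled_sum = store_items['glasses'].add(store_items['pants'], fill_value=0)

print(plain_sum)
print(filled_sum)
```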




Suppose now that you opened a new store and need to add its stock to your DataFrame. We can do this by adding a new row to the store_items DataFrame. To add a row, we first create a new DataFrame that holds it and then append that DataFrame to the original one. Let's see how this works:

In [31]:
new_items = [{'bikes': 20, 'pants': 30, 'watches': 35, 'glasses': 4}]
new_store = pd.DataFrame(new_items, index = ['store 3'])
print(new_store)
print('\n')

         bikes  glasses  pants  watches
store 3     20        4     30       35




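We now add this row to our store_items DataFrame. Older pandas releases offered a DataFrame.append() method for this, but it was deprecated in pandas 1.4 and removed in pandas 2.0, so we use pd.concat() instead. Passing sort=True orders the columns alphabetically, which is the column order shown in the output below.

In [32]:
# We append store 3 to our store_items DataFrame
# sort=True sorts the column labels alphabetically
store_items = pd.concat([store_items, new_store], sort=True)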

store_items

         bikes  glasses  pants  shirts  suits  watches
store 1     20      NaN     30    15.0   45.0       35
store 2     15     50.0      5     2.0    7.0       10
store 3     20      4.0     30     NaN    NaN       35
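
Notice that the columns of the combined DataFrame are in alphabetical order and that the items missing from store 3 are filled with NaN. As a rough sketch on smaller stand-in data, pd.concat() can combine rows from DataFrames whose columns differ, and its sort argument controls whether the resulting columns are sorted. The names keep_order and alphabetical are illustrative.

```python
import pandas as pd

# Small stand-ins for store_items and new_store
store_items = pd.DataFrame({'bikes': [20, 15], 'pants': [30, 5], 'shirts': [15, 2]},
                           index=['store 1', 'store 2'])
new_store = pd.DataFrame({'bikes': [20], 'glasses': [4], 'pants': [30]},
                         index=['store 3'])

# sort=False keeps the columns in order of first appearance
keep_order = pd.concat([store_items, new_store], sort=False)

# sort=True sorts the column labels alphabetically
alphabetical = pd.concat([store_items, new_store], sort=True)

print(keep_order)
print(alphabetical)
```

Columns that exist in only one of the inputs are filled with NaN for the rows that lack them, which is also why integer columns such as shirts become floating point.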


We can also add new columns to our DataFrame by using only data from particular rows in particular columns. For example, suppose that you want to stock stores 2 and 3 with new watches and you want the quantity of the new watches to be the same as the watches already in stock for those stores. Because the assignment aligns the data on the row labels, any row that is missing from the selection (here store 1) receives NaN. Let's see how we can do this:

In [33]:
# We add a new column using data from particular rows in the watches column
store_items['new watches'] = store_items['watches'][1:]

print(store_items)
print('\n')

         bikes  glasses  pants  shirts  suits  watches  new watches
store 1     20      NaN     30    15.0   45.0       35          NaN
store 2     15     50.0      5     2.0    7.0       10         10.0
store 3     20      4.0     30     NaN    NaN       35         35.0
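
The NaN in the store 1 row comes from index alignment: the sliced Series only carries the labels store 2 and store 3. The sketch below reproduces this on a one-column stand-in and uses .iloc to make the positional slice explicit, then shows that a label-based slice with .loc selects the same rows. The names by_position and by_label are illustrative.

```python
import pandas as pd

# One-column stand-in for store_items
store_items = pd.DataFrame({'watches': [35, 10, 35]},
                           index=['store 1', 'store 2', 'store 3'])

# Positional slice: every row from position 1 onward
by_position = store_items['watches'].iloc[1:]

# Label slice: every row from 'store 2' onward (the start label is included)
by_label = store_items['watches'].loc['store 2':]

# Assignment aligns on row labels, so store 1 receives NaN
store_items['new watches'] = by_position

print(store_items)
print('Same selection:', by_position.equals(by_label))
```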




It is also possible to insert new columns anywhere we want in a DataFrame. The dataframe.insert(loc,label,data) method inserts a new column at position loc, with the given column label and the given data. Let's add a new column named shoes right before the suits column. Since suits has numerical index 4 (counting from 0), we will use this value as loc. Let's see how this works:

In [34]:
# We insert a new column with label shoes right before the column with numerical index 4
store_items.insert(4, 'shoes', [8,5,0])

print(store_items)
print('\n')

         bikes  glasses  pants  shirts  shoes  suits  watches  new watches
store 1     20      NaN     30    15.0      8   45.0       35          NaN
store 2     15     50.0      5     2.0      5    7.0       10         10.0
store 3     20      4.0     30     NaN      0    NaN       35         35.0
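
Hard-coding the position works, but it breaks silently if the column order changes. One possible alternative, sketched here on a smaller stand-in DataFrame, is to look the position up with columns.get_loc() and pass the result to .insert(). The variable suits_position is illustrative.

```python
import pandas as pd

# Small stand-in for store_items
store_items = pd.DataFrame({'bikes': [20, 15, 20],
                            'pants': [30, 5, 30],
                            'suits': [45, 7, 0],
                            'watches': [35, 10, 35]},
                           index=['store 1', 'store 2', 'store 3'])

# Look up the integer position of the suits column instead of hard-coding it
suits_position = store_items.columns.get_loc('suits')

# Insert shoes at that position, which places it right before suits
store_items.insert(suits_position, 'shoes', [8, 5, 0])

print(suits_position)
print(store_items)
```

Unlike most DataFrame methods, .insert() modifies the DataFrame in place and returns None, so its result should not be assigned back to store_items.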




Just as we can add rows and columns, we can also delete them. To delete rows and columns from our DataFrame we will use the .pop() and .drop() methods. The .pop() method only deletes columns: it removes the column from the DataFrame in place and returns it. The .drop() method can delete both rows and columns, selected with the axis keyword, and returns a new DataFrame. Let's see some examples:

In [35]:
# We remove the new watches column
store_items.pop('new watches')

store_items

         bikes  glasses  pants  shirts  shoes  suits  watches
store 1     20      NaN     30    15.0      8   45.0       35
store 2     15     50.0      5     2.0      5    7.0       10
store 3     20      4.0     30     NaN      0    NaN       35


In [37]:
# We remove the watches and shoes columns
store_items = store_items.drop(['watches', 'shoes'], axis=1)

store_items

         bikes  glasses  pants  shirts  suits
store 1     20      NaN     30    15.0   45.0
store 2     15     50.0      5     2.0    7.0
store 3     20      4.0     30     NaN    NaN


In [38]:
# We remove the store 2 and store 1 rows
store_items = store_items.drop(['store 2', 'store 1'], axis=0)

print(store_items)
print('\n')

         bikes  glasses  pants  shirts  suits
store 3     20      4.0     30     NaN    NaN
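
The examples above select rows or columns with the axis keyword. As a sketch of an equivalent spelling, .drop() also accepts columns= and index= keywords, which can be combined in one call. The example also shows that .pop() hands back the removed column. The names removed and trimmed are illustrative, and the data is a small stand-in.

```python
import pandas as pd

# Small stand-in for store_items
store_items = pd.DataFrame({'bikes': [20, 15, 20],
                            'shoes': [8, 5, 0],
                            'watches': [35, 10, 35]},
                           index=['store 1', 'store 2', 'store 3'])

# .pop() removes the column in place and returns it as a Series
removed = store_items.pop('watches')

# .drop() returns a new DataFrame; columns= and index= replace the axis keyword
trimmed = store_items.drop(columns=['shoes'], index=['store 1'])

print(removed)
print(trimmed)
print(store_items)  # still contains shoes and store 1
```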




Sometimes we might need to change the row and column labels. Let's change the bikes column label to hats using the .rename() method:

In [40]:
# We change the column label bikes to hats
store_items = store_items.rename(columns = {'bikes': 'hats'})

print(store_items)
print('\n')

         hats  glasses  pants  shirts  suits
store 3    20      4.0     30     NaN    NaN




Now let's change the row label using the .rename() method again.

In [41]:
# We change the row label from store 3 to last store
store_items = store_items.rename(index = {'store 3': 'last store'})

print(store_items)
print('\n')

            hats  glasses  pants  shirts  suits
last store    20      4.0     30     NaN    NaN
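
The two renames above can also be done in a single call, since .rename() accepts both the columns and index keywords at once. The sketch below uses a small stand-in DataFrame. The label boats is hypothetical and is included only to show that, by default, a label that does not exist is ignored without raising an error.

```python
import pandas as pd

# Small stand-in for store_items
store_items = pd.DataFrame({'bikes': [20], 'glasses': [4.0]},
                           index=['store 3'])

# Rename a column label and a row label in one call
renamed = store_items.rename(columns={'bikes': 'hats'},
                             index={'store 3': 'last store'})

# A label that is not present ('boats' is hypothetical) is ignored by default
unchanged = store_items.rename(columns={'boats': 'ships'})

print(renamed)
print(unchanged)
```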




You can also change the index to be one of the columns in the DataFrame.

In [42]:
# We change the row index to be the data in the pants column
store_items = store_items.set_index('pants')

store_items

       hats  glasses  shirts  suits
pants
30       20      4.0     NaN    NaN
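
Setting a column as the index replaces the previous row labels, so last store no longer appears in the result. As a minimal sketch on a stand-in DataFrame, .reset_index() moves the index back into an ordinary column and installs a default integer index. The names indexed and restored are illustrative.

```python
import pandas as pd

# Small stand-in for store_items
store_items = pd.DataFrame({'hats': [20], 'pants': [30], 'glasses': [4.0]},
                           index=['last store'])

# The pants column becomes the row index; the old labels are discarded
indexed = store_items.set_index('pants')

# reset_index() turns the index back into a regular column
restored = indexed.reset_index()

print(indexed)
print(restored)
```

The original last store label is not recovered by .reset_index(), because .set_index() dropped it. To keep it, one option is to call .reset_index() before .set_index() so that the old labels are first stored in a column.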
