# Dictionaries

You create a dictionary with curly brackets, writing each entry as a key: value pair, like this:

{"afghanistan": 30.55}

In [7]:
world_pop = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}

In [8]:
world_pop["albania"]

2.77

That is how you retrieve the value stored under a particular key: put the key in square brackets after the dictionary name.

Keys should be unique. If the same key appears more than once, only the last value assigned to it is kept.
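
As a quick check of the uniqueness rule, the sketch below repeats one key inside a single dictionary literal. The name world_pop_dup and the second value (2.81) are made up for illustration.

```python
# A dictionary literal that repeats the key "albania".
# The second value (2.81) is a made-up number for illustration only.
world_pop_dup = {"afghanistan": 30.55, "albania": 2.77, "albania": 2.81}

# Only one "albania" entry survives, holding the last value written.
print(world_pop_dup)
print(len(world_pop_dup))
```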

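Keys also have to be immutable objects, meaning objects that cannot be changed after they are created (strictly speaking, they must be hashable). Strings, numbers, and tuples of immutable items all qualify. Lists are mutable, so they cannot be used as dictionary keys. However, a list can be used as the value assigned to a key.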
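
A minimal sketch of the key rule follows. The dictionaries locations and regions, and the coordinate values in them, are hypothetical and only serve to show which objects are accepted as keys.

```python
# Tuples are immutable, so they can serve as keys.
# The coordinates are illustrative values only.
locations = {(40.42, -3.70): "madrid", (48.86, 2.35): "paris"}
print(locations[(48.86, 2.35)])

# Lists are mutable and therefore unhashable: using one as a key fails.
list_key_failed = False
try:
    bad = {["spain", "france"]: "europe"}
except TypeError as err:
    list_key_failed = True
    print("TypeError:", err)

# A list is fine as a value, though.
regions = {"europe": ["spain", "france"]}
print(regions["europe"][0])
```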

How do you add more data?

In [9]:
world_pop["sealand"] = 0.000027

How do you change data?

In [11]:
world_pop["sealand"] = 0.000028

And to delete an entry:

In [13]:
del world_pop["sealand"]
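
The add, update, and delete steps above can be checked with the in operator, which tests whether a key is present. This is only a sketch that replays the same three steps on a fresh copy of world_pop.

```python
world_pop = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}

# Assigning to a key that does not exist yet adds a new entry.
world_pop["sealand"] = 0.000027
added = "sealand" in world_pop
print(added)

# Assigning to an existing key overwrites its value.
world_pop["sealand"] = 0.000028
updated_value = world_pop["sealand"]
print(updated_value)

# del removes the key and its value together.
del world_pop["sealand"]
still_there = "sealand" in world_pop
print(still_there)
```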

With a list, you can select, update, and remove elements, and those elements are indexed by a range of numbers. Dictionaries, on the other hand, are indexed by unique keys.

If you have a collection of values where order matters and you want to select entire subsets, then you go with a list. If you need a lookup table and need to look up data fast, then you use a dictionary.
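
To illustrate the trade-off, here is a small sketch that stores the same populations both ways. The list names populations and countries are assumptions introduced for the comparison.

```python
# A list keeps order and supports slicing by position.
populations = [30.55, 2.77, 39.21]
first_two = populations[0:2]
print(first_two)

# A dictionary maps unique keys to values, so lookup is by key, not position.
world_pop = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}
print(world_pop["algeria"])

# With parallel lists, you first have to search for the position.
countries = ["afghanistan", "albania", "algeria"]
position = countries.index("algeria")
print(populations[position])
```

The dictionary lookup goes straight from key to value, while the list version needs an extra search step to find the right position.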

If you have a dictionary nested within a dictionary, you can access a value in the inner dictionary by chaining square-bracket lookups together.

In [15]:
europe = { 
    'spain': { 'capital':'madrid', 'population':46.77 },
    'france': { 'capital':'paris', 'population':66.03 },
    'germany': { 'capital':'berlin', 'population':80.62 },
    'norway': { 'capital':'oslo', 'population':5.084 } 
}

In [17]:
print(europe['spain']['population'])

46.77
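
The chained lookup can be read one bracket at a time, as this sketch on a shortened copy of europe shows. The "currency" key is a hypothetical addition that is not part of the original data.

```python
europe = {
    "spain": {"capital": "madrid", "population": 46.77},
    "france": {"capital": "paris", "population": 66.03},
}

# Each pair of brackets goes one level deeper: first the country, then the field.
spain = europe["spain"]              # the inner dictionary
print(spain["capital"])
print(europe["spain"]["capital"])    # same lookup, chained

# The inner dictionary can be updated through the same chain.
# ("currency" is a hypothetical key added for illustration.)
europe["france"]["currency"] = "euro"
print(europe["france"])
```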


# Pandas

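Pandas is a high-level data manipulation tool built on the NumPy package. Its central object is the DataFrame, a table with labeled rows and columns, which can be built from a dictionary of lists.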

In [18]:
brics_data = {
    "country":["Brazil", "Russia", "India", "China", "South Africa"],
    "capital":["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    "area":[8.516, 17.10, 3.286, 9.597, 1.221],
    "population":[200.4, 143.5, 1252, 1357, 52.98]
}

In [19]:
import pandas as pd

In [20]:
brics = pd.DataFrame(brics_data)

In [22]:
brics

     area    capital       country  population
0   8.516   Brasilia        Brazil      200.40
1  17.100     Moscow        Russia      143.50
2   3.286  New Delhi         India     1252.00
3   9.597    Beijing         China     1357.00
4   1.221   Pretoria  South Africa       52.98


In [23]:
brics.index = ["BR", "RU", "IN", "CH", "SA"]

In [24]:
brics

      area    capital       country  population
BR   8.516   Brasilia        Brazil      200.40
RU  17.100     Moscow        Russia      143.50
IN   3.286  New Delhi         India     1252.00
CH   9.597    Beijing         China     1357.00
SA   1.221   Pretoria  South Africa       52.98


You can also load the data from a CSV file with pd.read_csv. The path below is a placeholder, so the call raises an error until it points to a file that exists.

In [25]:
brics = pd.read_csv("path/to/brics.csv", index_col=0)
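
Since the path above is only a placeholder, here is a minimal sketch of pd.read_csv that runs without a file on disk. The CSV text is an assumption: it is typed out from the BRICS data shown earlier and wrapped in io.StringIO so that pandas can read it like a file.

```python
import io
import pandas as pd

# Stand-in for a real file: CSV text built from the BRICS data above.
# The first column has an empty header and holds the row labels.
csv_text = """,country,capital,area,population
BR,Brazil,Brasilia,8.516,200.4
RU,Russia,Moscow,17.10,143.5
IN,India,New Delhi,3.286,1252
CH,China,Beijing,9.597,1357
SA,South Africa,Pretoria,1.221,52.98
"""

# index_col=0 tells pandas to use the first column as the row index.
brics_from_csv = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(brics_from_csv)
```

With a real file, the first argument would be the path to that file instead of the StringIO object.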

To select the country column from brics:

In [26]:
print(brics["country"])

BR          Brazil
RU          Russia
IN           India
CH           China
SA    South Africa
Name: country, dtype: object


Selecting with single brackets returns a Series, which is a one-dimensional labeled array. To keep the result as a DataFrame, use double brackets:

In [27]:
print(brics[["country"]])

         country
BR        Brazil
RU        Russia
IN         India
CH         China
SA  South Africa


In [28]:
print(brics[["country", "capital"]])

         country    capital
BR        Brazil   Brasilia
RU        Russia     Moscow
IN         India  New Delhi
CH         China    Beijing
SA  South Africa   Pretoria
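
To make the difference between single and double brackets concrete, this sketch rebuilds a two-column version of brics and checks the type and shape that each form returns. Passing index directly to pd.DataFrame is a shortcut used here instead of assigning brics.index afterwards.

```python
import pandas as pd

brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Single brackets give a Series; double brackets give a DataFrame.
as_series = brics["country"]
as_frame = brics[["country"]]
print(type(as_series))
print(type(as_frame))

# The shapes differ too: one dimension versus two.
print(as_series.shape)
print(as_frame.shape)
```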


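You can also select specific rows by slicing with square brackets. For example, to get the middle three rows (positions 1 to 3, since the end of the slice is excluded):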

In [29]:
print(brics[1:4])

      area    capital country  population
RU  17.100     Moscow  Russia       143.5
IN   3.286  New Delhi   India      1252.0
CH   9.597    Beijing   China      1357.0


loc lets you select data based on labels, while iloc lets you select data based on positions. For example, to select a row by its label with loc:

In [30]:
print(brics.loc["RU"])

area            17.1
capital       Moscow
country       Russia
population     143.5
Name: RU, dtype: object


In [32]:
print(brics.loc[["RU"]])

    area capital country  population
RU  17.1  Moscow  Russia       143.5


In [33]:
print(brics.loc[["RU", "IN", "CH"]])

      area    capital country  population
RU  17.100     Moscow  Russia       143.5
IN   3.286  New Delhi   India      1252.0
CH   9.597    Beijing   China      1357.0


In [35]:
print(brics.loc[["RU", "IN", "CH"], ["country", "capital"]])

   country    capital
RU  Russia     Moscow
IN   India  New Delhi
CH   China    Beijing


To select all rows while keeping only some columns, put a colon in the row position:

In [37]:
print(brics.loc[:, ["country", "capital"]])

         country    capital
BR        Brazil   Brasilia
RU        Russia     Moscow
IN         India  New Delhi
CH         China    Beijing
SA  South Africa   Pretoria
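
The examples above all use loc. As a rough counterpart, this sketch repeats some of the selections with iloc, which takes integer positions instead of labels. The explicit columns argument is an assumption added here to fix the column order, so that positions 0 and 1 are known to be country and capital.

```python
import pandas as pd

brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
        "population": [200.4, 143.5, 1252, 1357, 52.98],
    },
    columns=["country", "capital", "area", "population"],  # fix the order
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Row "RU" sits at position 1, so both lines select the same row.
same_row = brics.iloc[1].equals(brics.loc["RU"])
print(same_row)

# Rows at positions 1-3 and columns at positions 0-1, versus the same by label.
by_position = brics.iloc[[1, 2, 3], [0, 1]]
by_label = brics.loc[["RU", "IN", "CH"], ["country", "capital"]]
print(by_position.equals(by_label))

# Slices differ: iloc excludes the end position, loc includes the end label.
same_slice = brics.iloc[1:4].equals(brics.loc["RU":"CH"])
print(same_slice)
```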
