# Working with Nested Data

## Introduction

Believe it or not, we are now ready to work with data from the web.

Do you feel ready?  Good.

## Getting Our Data

When we get data in the real world, it often comes in the same format: a list of dictionaries.  Good thing for us, we know lists and we know dictionaries.

Let's see this by way of example.  Below is the web address of a website that lists job openings from the City of New York.

https://data.cityofnewyork.us/resource/kpav-sd4t.json

If we go there, we'll see something like the following:

<img src="https://github.com/jigsawlabs-student/code-intro/blob/master/job_listings.png?raw=1" width="100%">

> When we go to the URL above and get back information, we are accessing an API.  APIs are really cool because they allow programmers to access information gathered by other organizations.  For example, above we are getting a list of job openings from the City of New York's open data API.  If you'd like to see different kinds of data we can get from APIs, [click here](https://github.com/public-apis/public-apis).

Ok, so that mix of text above may look like a mess, but really it is just a list of dictionaries, where each dictionary is a separate position.  If we simplify it a little, the data takes the following form:

```python
[
    {'agency': 'NYC Housing Authority', 'business_title': 'housing assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'senior analyst'}
]
```

And while reading through all of the original data is not sustainable, asking questions of a list of dictionaries is.  In fact, we do it all of the time in programming.

So let's learn how to ask questions of a list of dictionaries.  It's great training for working with data from the Internet.
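To make the "list of dictionaries" idea concrete, here is a minimal sketch of how text from an API can become Python data.  It assumes the API sends back JSON text, and it uses a shortened, hand-typed sample (`raw_text`) rather than the real response.  The names `raw_text` and `sample_jobs` are just for illustration.

```python
import json

# Hand-typed sample that mimics the shape of the text an API sends back.
# (An assumption for illustration; the real response has many more fields.)
raw_text = '''
[
    {"agency": "NYC Housing Authority", "business_title": "housing assistant"},
    {"agency": "Office of Management and Budget", "business_title": "senior analyst"}
]
'''

# json.loads turns JSON text into regular Python lists and dictionaries.
sample_jobs = json.loads(raw_text)

print(type(sample_jobs))     # the outer container is a list
print(type(sample_jobs[0]))  # each element inside it is a dictionary
print(len(sample_jobs))      # the number of jobs in our sample
```

Once the text is parsed, everything we already know about lists and dictionaries applies.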

## Let's get a job

* First review

Remember that we use curly braces, `{}`, to create a dictionary.

In [None]:
nyc_job = {'agency': 'NYC Housing Authority', 'business_title': 'housing assistant'}

> Press shift + enter to assign the dictionary to `nyc_job`.

And we use the square brackets to access data from a dictionary.

In [None]:
nyc_job['business_title']

'housing assistant'

> As we said, we think of this as asking questions of our dictionary.  Hey `nyc_job`, what is the `'business_title'` of the job? 

And our dictionary responds with `housing assistant`.
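As a quick refresher, the sketch below asks the same dictionary a few more questions.  The key `'salary'` is a made-up example of a key that is *not* in our dictionary; it is only there to show what happens when we ask a question the dictionary cannot answer.

```python
nyc_job = {'agency': 'NYC Housing Authority', 'business_title': 'housing assistant'}

# Square brackets ask for the value stored under a key.
print(nyc_job['agency'])

# keys() lists every question this dictionary can answer.
print(list(nyc_job.keys()))

# 'salary' is a hypothetical key that our dictionary does not have.
# nyc_job['salary'] would raise a KeyError, while .get() returns None instead.
print(nyc_job.get('salary'))
```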

* Now for nested data

Ok, now below we assign a list of jobs to the variable `jobs`.

In [None]:
jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]

jobs

[{'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
 {'agency': 'Office of Management and Budget',
  'business_title': 'Senior Analyst'}]

Remember that we use the index to retrieve a specific element from a list.  Ok, let's ask our list above for the first job.

> Press shift + return below.

In [None]:
jobs[0]

{'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'}

> It gives us the first element in the list, the dictionary representing the job from the `NYC Housing Authority`.

Now, on your own, access the second element from the list of jobs in the cell below.

> Think about asking the `jobs` list a question.  If you're correct, you'll see the return value below the cell match what we have commented out.

In [None]:


# {'agency': 'Office of Management and Budget',  'business_title': 'Senior Analyst'}

### More complicated

Now let's try to dig further inside of this list.

In [None]:
jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]
jobs

[{'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
 {'agency': 'Office of Management and Budget',
  'business_title': 'Senior Analyst'}]

We'll start by trying to select the `business_title` of the **last element**, which is `Senior Analyst`.  Notice that we can't do this directly.  All we are starting with is a variable that points to the list of `jobs`.

But in that list of `jobs`, the last element is the dictionary representing the second job, and inside that dictionary is the `business_title` of `Senior Analyst`.

So to get there, we need to ask two questions:

1. Hey, `jobs`, what's your second element?
2. Hey second element, what's your `business_title`?

`jobs -> second job -> business_title`

So that's the path we need to go through to access our data.  Let's try it.

1. Start with the list of jobs and ask for the second element

In [None]:
jobs[1]

{'agency': 'Office of Management and Budget',
 'business_title': 'Senior Analyst'}

2. Now that we have the second element, ask for the `business_title`

In [None]:
jobs[1]['business_title']

'Senior Analyst'

> Notice that there are no spaces in the line above.
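If chaining the two questions on one line feels dense, one option is to store the answer to the first question in a variable before asking the second.  The sketch below shows that the two approaches give the same result.  The variable names `second_job` and `title` are just illustrative choices.

```python
jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]

# Question 1: hey jobs, what's your second element?
second_job = jobs[1]

# Question 2: hey second element, what's your business_title?
title = second_job['business_title']
print(title)

# Chaining both questions on one line gives the same answer.
print(jobs[1]['business_title'] == title)

# An index of -1 asks for the last element, however long the list is.
print(jobs[-1]['business_title'])
```

Here the second job is also the last one, so `jobs[1]` and `jobs[-1]` point to the same dictionary.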

### A little analogy

So we accessed this information with the following logic:

`jobs -> second job -> business_title`

Now for an analogy.  Let's think of our data as a mailbox.

<img src="https://github.com/jigsawlabs-student/code-intro/blob/master/mailboxes.jpg?raw=1" width = 30%>

In [None]:
jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]
jobs

[{'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
 {'agency': 'Office of Management and Budget',
  'business_title': 'Senior Analyst'}]

* The whole gray mailbox represents the list of **jobs**.  
* Each door contains a different element in our list.  
* Then inside the mailbox representing a specific job, there are different pieces of information about that job.

``jobs -> second job -> business_title``

In [None]:
jobs[1]['business_title']

'Senior Analyst'

Ok, now it's your turn to practice.  Once again, here is our list of jobs.

In [None]:
jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]

Select the agency related to the first job.  You'll see it's correct if pressing `shift + return` returns the string `'NYC Housing Authority'`. 

In [None]:


# 'NYC Housing Authority'

If you were able to do this, congratulations :)  You are certainly on your way.  Being able to work with nested data structures is so important because we see them so often on the web.

In fact let's go get some live jobs data right now.

### Wrapping Up

In [None]:
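import pandas as pd

# Web address of the City of New York jobs data
url = "https://data.cityofnewyork.us/resource/kpav-sd4t.json"
# Read the JSON at that address into a table (a DataFrame)
jobs_df = pd.read_json(url)
# Convert the table into a list of dictionaries, one per job
internet_jobs = jobs_df.to_dict('records')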

> Press shift + return on the cell above.

Guess what, we just gathered a list of jobs from the Internet.  Now if we look at the entire list we'll see that it's too much information to handle.  In fact, let's just look at the first dictionary.

In [None]:
internet_jobs[0]

**But** it's just a dictionary, so we can use the same logic as before to select information from the first dictionary in the list -- `internet_jobs[0]['agency']`.

In [None]:
internet_jobs[0]['agency']

> So you can see that by knowing how to navigate a list of dictionaries, we can browse data that comes from the Internet.
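When a list from the Internet is too large to read, two small questions help us get our bearings: how many records are there, and which keys does a record have?  The sketch below uses a stand-in list, `sample_internet_jobs`, built from the two jobs in this lesson so that it runs without an Internet connection.  The real `internet_jobs` list will have many more records and keys.

```python
# Stand-in for internet_jobs (an assumption, so the sketch runs offline).
sample_internet_jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]

# How many jobs did we get back?
print(len(sample_internet_jobs))

# Which questions can we ask of a single job?
first_job = sample_internet_jobs[0]
print(sorted(first_job.keys()))

# Then we navigate from the outside in, just like before.
print(first_job['agency'])
```

The same three lines should work on the live list by swapping in `internet_jobs`.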

Try accessing the `agency` from the second element in our `internet_jobs` list.

And in fact, we can use a loop to print out the application instructions for each job.

In [None]:
for job in internet_jobs:
    print(job['to_apply'])

> But we're getting ahead of ourselves.  We'll dedicate future lessons to looping through a list of dictionaries.

### Summary

In this lesson, we saw how to work with nested data structures.  We saw that we can represent a table of data as a list of dictionaries.

In [None]:
jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]

And we can select this data by working from the outside in.  So if we want to get the agency of a specific job, we take the following path.

`jobs -> first job -> agency`

In [None]:
jobs[0]['agency']

'NYC Housing Authority'

If you feel like you need some more practice with this, take some time and practice.  In the next lesson, we'll learn how to loop through our data.

> For more practice click on the video below.

[Nested data review video](https://www.youtube.com/watch?v=OPry8oI-bg0&list=PLCG6Te769p1gkVJizwSmo6GoEI9oHoAPA&index=15&ab_channel=JigsawLabs)


### Answers

In [None]:
jobs = [
    {'agency': 'NYC Housing Authority', 'business_title': 'Housing Assistant'},
    {'agency': 'Office of Management and Budget', 'business_title': 'Senior Analyst'}
]

In [None]:
jobs[1]

{'agency': 'Office of Management and Budget',
 'business_title': 'Senior Analyst'}

In [None]:
jobs[0]['agency']

'NYC Housing Authority'

In [None]:
internet_jobs[1]['agency']