# Lab 11

_[General notebook information](https://computing-in-context.afeld.me/notebooks.html)_

We are going to look at how the population of New York City's community districts has changed over time.


## Step 0

Read the data from the [New York City Population By Community Districts](https://data.cityofnewyork.us/City-Government/New-York-City-Population-By-Community-Districts/xi7c-iiu2/data) dataset into a DataFrame called `pop_by_cd`. To get the URL of the CSV file:

1. Visit the page linked above.
1. Click `Export`.
1. Right-click `CSV`.
1. Click `Copy Link Address` (or `Location`, depending on your browser).


In [1]:
import pandas as pd
import plotly.express as px

pop_by_cd = pd.read_csv(
    "https://data.cityofnewyork.us/api/views/xi7c-iiu2/rows.csv?accessType=DOWNLOAD",
    low_memory=False
)

pop_by_cd.head()


Unnamed: 0,Borough,CD Number,CD Name,1970 Population,1980 Population,1990 Population,2000 Population,2010 Population
0,Bronx,1,"Melrose, Mott Haven, Port Morris",138557,78441,77214,82159,91497
1,Bronx,2,"Hunts Point, Longwood",99493,34399,39443,46824,52246
2,Bronx,3,"Morrisania, Crotona Park East",150636,53635,57162,68574,79762
3,Bronx,4,"Highbridge, Concourse Village",144207,114312,119962,139563,146441
4,Bronx,5,"University Hts., Fordham, Mt. Hope",121807,107995,118435,128313,128200


## Step 1

Prepare the data. Use the following code to [reshape](https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html#melt-and-wide-to-long) the DataFrame to have one row per community district per Census year.


In [4]:
# Step 1: reshape data

# turn the population columns into rows
populations = pd.melt(
    pop_by_cd,
    id_vars=["Borough", "CD Number", "CD Name"],
    var_name="year",
    value_name="population",
)

# turn the years into numbers
populations["year"] = populations["year"].str.replace(" Population", "").astype(int)

populations


Unnamed: 0,Borough,CD Number,CD Name,year,population
0,Bronx,1,"Melrose, Mott Haven, Port Morris",1970,138557
1,Bronx,2,"Hunts Point, Longwood",1970,99493
2,Bronx,3,"Morrisania, Crotona Park East",1970,150636
3,Bronx,4,"Highbridge, Concourse Village",1970,144207
4,Bronx,5,"University Hts., Fordham, Mt. Hope",1970,121807
...,...,...,...,...,...
290,Queens,13,"Queens Village, Rosedale",2010,188593
291,Queens,14,"The Rockaways, Broad Channel",2010,114978
292,Staten Island,1,"Stapleton, Port Richmond",2010,175756
293,Staten Island,2,"New Springville, South Beach",2010,132003
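
To see what `pd.melt` is doing, it can help to try it on a tiny table first. The sketch below uses a made-up two-district DataFrame (the district names and numbers are invented for illustration, not taken from the dataset) and applies the same two steps: melt the year columns into rows, then strip the `" Population"` suffix so the year becomes an integer.

```python
import pandas as pd

# Hypothetical wide table: one row per district, one column per Census year.
wide = pd.DataFrame(
    {
        "CD Name": ["District A", "District B"],
        "1970 Population": [100, 200],
        "1980 Population": [110, 190],
    }
)

# Every column not listed in id_vars becomes a (year, population) pair of rows.
long = pd.melt(wide, id_vars=["CD Name"], var_name="year", value_name="population")

# Remove the " Population" suffix so the year can be treated as a number.
long["year"] = long["year"].str.replace(" Population", "").astype(int)

print(long)
```

Two districts and two year columns produce four rows, one per district per year, which is the same shape the plotting steps below expect.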


## Step 2

Create a line chart of the population over time for each community district in Manhattan. There should be [one line for each](https://plotly.com/python/line-charts/#Line-Plots-with-column-encoding-color).


In [7]:
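# make sure the population values are integers,
# stripping any thousands separators (e.g. "138,557") first
populations["population"] = (
    populations["population"]
    .astype(str)
    .str.replace(",", "")
    .astype(int)
)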

# keep only the Manhattan rows and draw one line per community district
figure = px.line(
    populations[populations["Borough"] == "Manhattan"],
    x="year",
    y="population",
    color="CD Name",
    title="Population of Manhattan Community Districts Over Time",
    labels={
        "year": "Year",
        "population": "Population",
        "CD Name": "Community District Name"
    }
)

figure.show()
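
The cleanup at the top of the cell above only matters if the population column arrives as text with thousands separators. As a rough sketch of that situation, the example below builds a small in-memory CSV (the file contents and district names are hypothetical) and shows two ways to end up with integers: cleaning after loading, or passing `thousands=","` to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical CSV text with comma-formatted numbers, standing in for a raw export.
raw = io.StringIO('CD Name,population\nDistrict A,"138,557"\nDistrict B,"78,441"\n')

# Option 1: load as text, then strip the commas and convert.
messy = pd.read_csv(raw)
cleaned = messy["population"].astype(str).str.replace(",", "").astype(int)

# Option 2: rewind the buffer and let read_csv handle the separators.
raw.seek(0)
parsed = pd.read_csv(raw, thousands=",")

print(cleaned.tolist())
print(parsed["population"].tolist())
```

Either approach should work here; the second keeps the conversion next to the loading code, so later cells can assume numeric data.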



## Step 3

Starting with the same dataset, create a line chart of the population over time for each borough. There should be one line for each.


In [10]:

# add up the community district populations within each borough and year
borough_pop = (
    populations
    .groupby(["Borough", "year"], as_index=False)["population"]
    .sum()
)

figure = px.line(
    borough_pop,
    x="year",
    y="population",
    color="Borough",
    title="Population of NYC Boroughs Over Time",
    labels={
        "year": "Year",
        "population": "Population",
        "Borough": "Borough"
    }
)

figure.show()
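
One way to convince yourself that the `groupby` above does the right thing is to run it on data small enough to add up by hand. The sketch below uses an invented long-format table (boroughs `X` and `Y` and their numbers are placeholders, not real data) with two districts per borough.

```python
import pandas as pd

# Hypothetical long-format data: two boroughs, two districts each, two years.
toy = pd.DataFrame(
    {
        "Borough": ["X", "X", "X", "X", "Y", "Y", "Y", "Y"],
        "CD Number": [1, 2, 1, 2, 1, 2, 1, 2],
        "year": [1970, 1970, 1980, 1980, 1970, 1970, 1980, 1980],
        "population": [10, 20, 11, 21, 30, 40, 31, 41],
    }
)

# Sum the district populations within each (Borough, year) pair.
# as_index=False keeps Borough and year as regular columns for plotting.
toy_totals = toy.groupby(["Borough", "year"], as_index=False)["population"].sum()

print(toy_totals)
```

Each borough ends up with one row per year, so a line chart colored by `Borough` draws exactly one line for each.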


## Step 4

[Submit.](https://computing-in-context.afeld.me/notebooks.html#submission)
