<a href="https://colab.research.google.com/github/zshamroukh/Colab/blob/main/api_use_case_phx_demographics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# City-Level Demographic Insights in 3 Steps via the Parcl Labs API

# Step 1: Register for the API

You can retrieve Parcl Labs API data with only a few lines of GraphQL. This walkthrough uses Python, one of the most popular programming languages for data analysis, but the API can be used with your favorite tech stack. Sign up [here](https://www.parcllabs.com/contact?utm_source=ParclLabs&utm_medium=Blog&utm_campaign=API-USE-CASE-BLOG) for a free API key. Once you’ve received it, save it in an environment variable and read it into Python along with the API URL:

In [None]:
import os
import requests
import pandas as pd
import plotly.express as px

# Set the API URL and read the API key from an environment variable
url = "https://api-pilot.parcllabs.com/v1/graphql"
bearer_token = os.environ['PARCL_LABS_API_PILOT_KEY']
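If the environment variable is not set, `os.environ[...]` raises a bare `KeyError`. A minimal sketch of a more descriptive check is shown here. The helper name `load_api_key` is hypothetical and is not part of any Parcl Labs tooling.

```python
import os

def load_api_key(env_var="PARCL_LABS_API_PILOT_KEY", environ=None):
    """Return the API key stored in env_var, or raise a descriptive error."""
    # Accept a custom mapping (handy for testing); default to the real environment
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Parcl Labs API key."
        )
    return key
```

With this in place, `bearer_token = load_api_key()` could stand in for the direct `os.environ` lookup.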

# Step 2: Define the GraphQL query

In this example we use the API to examine the demographic breakdown of the Phoenix metropolitan area. First we name the query, here PHX_DEMOGRAPHICS, and then select all cities within the “Phoenix-Mesa-Chandler, AZ” MSA (metropolitan statistical area) from the CITY table (see the Parcl Labs API [docs](https://docs.parcllabs.com/docs/msa) for a comprehensive list of geographies and objects available in the API).

After filtering by MSA, the columns we output from the CITY table are CITY_NAME and PARCL_ID (our unique identifier for geographies at every level). Each CITY record contains a nested table, census, which holds the demographic data. In the example below we filter census to the year 2020 (census has values for every year, and we want only one population value per category). Finally we choose the variables to pull from census, in this case: total population, and the male and female populations of baby boomers, Gen Z, and millennials. The query looks like this:

In [None]:
# Store the query that fetches demographic information for the PHX MSA in 2020
query = """
query PHX_DEMOGRAPHICS {
  CITY(where: {MSA_NAME: {_eq: "Phoenix-Mesa-Chandler, AZ"}}) {
    CITY_NAME
    PARCL_ID
    census(where: {YEAR: {_eq: 2020}}) {
      POP_TOTAL
      Boomers_Male_Population
      Boomers_Female_Population
      Gen_Z_Male_Population
      Gen_Z_Female_Population
      Millennial_Female
      Millennial_Male_Population
    }
  }
}
"""

# Step 3: Call the API and visualize the results

Once you have the query, the URL, and the API key, you can call the API with Python:

In [None]:
# Call the API using our query and credentials
response = requests.post(
    url=url,
    json={"query": query},
    headers={
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
    },
)
# Stop early on HTTP-level failures such as 401 or 500
response.raise_for_status()

# Parse the JSON response into native Python objects
out = response.json()
data = out['data']['CITY']
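GraphQL servers commonly report query problems in an `errors` list inside the response body, often with an HTTP 200 status, in which case indexing `out['data']['CITY']` directly fails with an unhelpful `KeyError` or `TypeError`. The sketch below shows one way to guard against that. The helper name `extract_cities` is hypothetical, and the exact error shape returned by this API is an assumption.

```python
def extract_cities(payload):
    """Return the CITY records from a decoded GraphQL response body."""
    errors = payload.get("errors")
    if errors:
        # Each error is usually a dict with a "message" key; fall back to str()
        messages = "; ".join(
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        raise RuntimeError(f"GraphQL query failed: {messages}")
    # "data" may be missing or null when the query did not run
    data = payload.get("data") or {}
    return data.get("CITY", [])
```

Used with the request above, this would read `data = extract_cities(response.json())`.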

The next step is to flatten the JSON response into a dataframe and aggregate the demographic columns to drill into population by generation:

In [None]:
# Flatten the nested census records into one row per city, keeping CITY_NAME
PHX_DEMO_df = pd.json_normalize(data, 'census', ['CITY_NAME'])

# Sum the male and female counts to get one population column per generation
PHX_DEMO_df['Boomer Population'] = PHX_DEMO_df['Boomers_Male_Population'] + PHX_DEMO_df['Boomers_Female_Population']
PHX_DEMO_df['Millennial Population'] = PHX_DEMO_df['Millennial_Male_Population'] + PHX_DEMO_df['Millennial_Female']
PHX_DEMO_df['GenZ Population'] = PHX_DEMO_df['Gen_Z_Male_Population'] + PHX_DEMO_df['Gen_Z_Female_Population']

# Keep the 10 most populous cities
PHX_DEMO_df_clean = PHX_DEMO_df.sort_values(by=['POP_TOTAL'], ascending=False).head(10)

print(PHX_DEMO_df_clean)

    POP_TOTAL  Boomers_Male_Population  Boomers_Female_Population  \
0     1919172                   169927                     183223   
76     698136                    70774                      79521   
87     472180                    42425                      45884   
50     376936                    30171                      34576   
70     371021                    51642                      55196   
18     342427                    30841                      37444   
6      327886                    38051                      44098   
62     314237                    25264                      28294   
74     240948                    27893                      34058   
27     170648                    16814                      17672   

    Gen_Z_Male_Population  Gen_Z_Female_Population  Millennial_Female  \
0                  206864                   197776             215912   
76                  70288                    65205              71710   
[output truncated]
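To make the flattening step concrete, here is a pure-Python sketch of roughly what `pd.json_normalize(data, 'census', ['CITY_NAME'])` does for records of this shape: it emits one row per nested census record and carries the parent `CITY_NAME` down into each row. This is an illustration, not pandas' actual implementation. The function name `flatten_census` is hypothetical and the sample records use made-up placeholder numbers, not API output.

```python
def flatten_census(cities):
    """Emit one flat row per census record, tagged with its parent CITY_NAME."""
    rows = []
    for city in cities:
        for record in city.get("census", []):
            row = dict(record)                    # copy the nested census fields
            row["CITY_NAME"] = city["CITY_NAME"]  # carry the parent field down
            rows.append(row)
    return rows

# Placeholder records with made-up numbers, not real API output
sample = [
    {
        "CITY_NAME": "City A",
        "PARCL_ID": 1,
        "census": [
            {"POP_TOTAL": 100, "Boomers_Male_Population": 10, "Boomers_Female_Population": 12}
        ],
    },
    {"CITY_NAME": "City B", "PARCL_ID": 2, "census": []},
]

rows = flatten_census(sample)
boomers = rows[0]["Boomers_Male_Population"] + rows[0]["Boomers_Female_Population"]
```

Note that a city with an empty `census` list contributes no rows, and that `PARCL_ID` is dropped because only `CITY_NAME` is requested as a parent field.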

Finally, visualize the result with the library of your choice. Here we use [Plotly](https://plotly.com/):

In [None]:
#Plot the Demographics for the top 10 Phoenix Cities
PHX_DEMO_fig = px.bar(PHX_DEMO_df_clean, x='CITY_NAME',
                y=['Boomer Population', 'GenZ Population', "Millennial Population"],
                title='Population by City and Generation: Phoenix MSA',
                width=900, height=500,
                labels={"CITY_NAME": "City"}
                )

#PHX_DEMO_fig.update_traces(line_color='#4882db')
PHX_DEMO_fig.update_layout(
    yaxis=dict(title='Population'),
    barmode='group',
    xaxis={'categoryorder':'total descending'},
    legend=dict(title=None),
    bargap=0.15,
    bargroupgap=0.1
)

PHX_DEMO_fig.show()