Beautiful! Now our Alameda County tract and ACS data are ready too.

<a id="section5"></a>
## 1.5 Attribute Joins between Geodataframes and Dataframes

We just mapped the census tracts. But what makes a map powerful is when you map the data associated with the locations.

In order to map the ACS data we need to associate it with the tracts. We have polygon data in the `tracts_gdf_ac` geodataframe but no attributes of interest.

In a separate file, `census_variables_CA.csv`, we have our ACS 5-year data for 2018, which we just imported and read in as a `pandas` dataframe. We're now going to join the columns from that data to `tracts_gdf_ac` using a common key. This process is called an `attribute join`, which we covered in an earlier notebook.

We're going to be conducting a left join here. Think about why we would choose one type of join over another. You can read more about merging in `geopandas` [here](http://geopandas.org/mergingdata.html#attribute-joins).

<img src="https://shanelynnwebsite-mid9n9g1q9y8tt.netdna-ssl.com/wp-content/uploads/2017/03/join-types-merge-names.jpg">
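
To make the join types in the diagram concrete, here is a minimal sketch on two small made-up tables. The table contents and the `pop` column are hypothetical and only serve to show how the row count changes with the `how` argument of `merge`.

```python
import pandas as pd

# Hypothetical toy tables: three tracts, and attribute rows for two of those
# tracts plus one tract ("004") that is missing from the first table.
tracts = pd.DataFrame({"GEOID": ["001", "002", "003"]})
acs = pd.DataFrame({"FIPS_11_digit": ["002", "003", "004"], "pop": [10, 20, 30]})

row_counts = {}
for how in ["inner", "left", "right", "outer"]:
    joined = tracts.merge(acs, left_on="GEOID", right_on="FIPS_11_digit", how=how)
    row_counts[how] = len(joined)

print(row_counts)
```

An inner join keeps only keys found in both tables, a left or right join keeps every row of one side, and an outer join keeps everything.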



Let's talk about the data and the different join operations. What kind of join do we want to do?

In [None]:
# write any notes here

Let's take another look at the two data objects that we have -- do we see any columns that we can join on between the two?

In [None]:
# ACS 5 year data
acs5data_df.columns

Since it's hard to see all of our variables and know what types they are, let's use the `info` method instead.

In [None]:
acs5data_df.info()

Okay, awesome! Now let's go ahead and check out our tracts data.

In [None]:
# Tracts data
tracts_gdf_ac.head(2)

So it seems like `GEOID` in our tracts data and `FIPS_11_digit` in our ACS data are going to be the keys in our join.

<img src="http://www.pngall.com/wp-content/uploads/2016/03/Light-Bulb-Free-PNG-Image.png" width="20" align=left >  Let's check those variables-- do you see any differences?

In [None]:
tracts_gdf_ac['GEOID'].head()

In [None]:
acs5data_df['FIPS_11_digit'].head()

A `join` requires the key columns to have the same data type and matching values. Are we good to go?

In [None]:
# Write your thoughts here
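
If the two key columns ever turn out to differ, one common cause is that an ID was parsed as a number and lost its leading zero. The sketch below uses hypothetical values (they are not taken from our files) to show one way to check the types and bring the keys back into the same form.

```python
import pandas as pd

# Hypothetical values: the tract key is text, but the ACS key was parsed as an
# integer, so the leading zero of the California state code "06" was lost.
tracts = pd.DataFrame({"GEOID": ["06001400100", "06001400200"]})
acs = pd.DataFrame({"FIPS_11_digit": [6001400100, 6001400200], "pop": [3000, 2000]})

# Compare the data types of the two key columns
print(tracts["GEOID"].dtype, acs["FIPS_11_digit"].dtype)

# Cast the numeric key to text and pad it back to 11 characters
acs["FIPS_11_digit"] = acs["FIPS_11_digit"].astype(str).str.zfill(11)

# Check that every tract key now has a match in the ACS key column
all_match = bool(tracts["GEOID"].isin(acs["FIPS_11_digit"]).all())
print(all_match)
```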

Use the `geopandas` `merge` command to join the two dataframes by matching the values in the `GEOID` and `FIPS_11_digit` columns. Then take a look at the output since it should contain our ACS data for Alameda County.

In [None]:
# Uncomment to view documentation 
#acs5data_df_ac.merge?

Let's do a `left` join to keep all of the census tracts in Alameda County and only the ACS data for those tracts.

In [None]:
# Left join keeps all tracts and the ACS data for those tracts
tracts_acs_gdf_ac = tracts_gdf_ac.merge(acs5data_df_ac, left_on='GEOID', right_on='FIPS_11_digit', how='left')
tracts_acs_gdf_ac.head(2)
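
A left join silently fills in missing values for any tract that has no match, so it can be worth checking for those. One option is the `indicator` argument of `merge`, which adds a `_merge` column recording where each row came from. The sketch below uses small made-up tables, not our real data.

```python
import pandas as pd

# Hypothetical toy tables: the third tract has no ACS row
tracts = pd.DataFrame({"GEOID": ["06001400100", "06001400200", "06001400300"]})
acs = pd.DataFrame({"FIPS_11_digit": ["06001400100", "06001400200"], "pop": [3000, 2000]})

# indicator=True adds a "_merge" column: "both", "left_only", or "right_only"
checked = tracts.merge(
    acs, left_on="GEOID", right_on="FIPS_11_digit", how="left", indicator=True
)

# Rows flagged "left_only" are tracts that found no match in the ACS table
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched["GEOID"].tolist())
```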

Let's see all the variables we have in our dataset now.

In [None]:
list(tracts_acs_gdf_ac.columns)

How many rows and columns should we have? Think about this before you run the next lines of code.

In [None]:
print("Rows and columns in the Alameda County Census tract gdf:", tracts_gdf_ac.shape)
print("Rows and columns in the Alameda County Census tract gdf joined to the ACS data:", tracts_acs_gdf_ac.shape)
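
The reasoning behind the expected shape can be sketched on toy data (the table contents here are hypothetical). With a left join and unique keys on the right side, the output has as many rows as the left table. Because the two key columns have different names, both are kept, so the column count is the sum of the two tables' columns. The `validate` argument asks `merge` to raise an error if the keys are not unique.

```python
import pandas as pd

# Hypothetical toy tables
left = pd.DataFrame({"GEOID": ["a", "b", "c"], "area": [1.0, 2.0, 3.0]})
right = pd.DataFrame({"FIPS_11_digit": ["a", "b"], "pop": [10, 20], "income": [5, 6]})

# validate="one_to_one" raises an error if either key column has duplicates
joined = left.merge(
    right, left_on="GEOID", right_on="FIPS_11_digit", how="left", validate="one_to_one"
)

# Rows come from the left table; columns are the sum of both tables
expected_shape = (left.shape[0], left.shape[1] + right.shape[1])
print(joined.shape, expected_shape)
```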

<div style="display:inline-block;vertical-align:top;">
    <img src="http://www.pngall.com/wp-content/uploads/2016/03/Light-Bulb-Free-PNG-Image.png" width="30" align=left > 
</div>  
<div style="display:inline-block;">

#### Question
</div>

1. What would happen if we did an inner join instead of a left join? A right join?
2. What is the data type of the output of the merge?

In [None]:
# Put your thoughts here

In [None]:
# Check the data type of the join output
type(tracts_acs_gdf_ac)

### Join Order Matters!

Above, we left joined the ACS5 dataframe to the tracts geodataframe. The output was a geodataframe of all census tracts and the ACS data for those tracts.

We can do a similar operation by joining the tracts geodataframe to the ACS dataframe. However, if we change the order of inputs we get a different type of output!

Let's check that out.

In [None]:
tracts_acs_df_ac = acs5data_df_ac.merge(tracts_gdf_ac, right_on='GEOID', left_on="FIPS_11_digit", how='right')

In [None]:
type(tracts_acs_df_ac)

In [None]:
print(tracts_acs_gdf_ac.shape)
print(tracts_acs_df_ac.shape)

In [None]:
tracts_acs_df_ac.columns

The number of rows and columns in the output is the same for both joins, but the output type is different: the second result is a plain `pandas` dataframe, even though it contains a geometry column.

So be careful when joining `geopandas` geodataframes and `pandas` dataframes. Always check your outputs to make sure they are what you expect.
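
If you do end up with a plain dataframe that still has a geometry column, one way to recover is to wrap it in a `GeoDataFrame` again. This is a minimal sketch on made-up point data (the tables and the `pop` column are hypothetical), assuming the geometry column is named `geometry`.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical toy data: two "tracts" represented by points
tracts = gpd.GeoDataFrame(
    {"GEOID": ["a", "b"]}, geometry=[Point(0, 0), Point(1, 1)], crs="EPSG:4326"
)
acs = pd.DataFrame({"FIPS_11_digit": ["a", "b"], "pop": [10, 20]})

# Calling merge on the geodataframe keeps the geodataframe type
gdf_first = tracts.merge(acs, left_on="GEOID", right_on="FIPS_11_digit", how="left")

# Calling merge on the dataframe returns a plain pandas dataframe
df_first = acs.merge(tracts, left_on="FIPS_11_digit", right_on="GEOID", how="right")
print(type(gdf_first).__name__, type(df_first).__name__)

# Rebuild a geodataframe from the plain dataframe, reusing the original CRS
restored = gpd.GeoDataFrame(df_first, geometry="geometry", crs=tracts.crs)
print(type(restored).__name__)
```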