![https://storage.googleapis.com/kaggle-competitions/kaggle/22420/logos/header.png?t=2020-09-08-21-21-20](https://storage.googleapis.com/kaggle-competitions/kaggle/22420/logos/header.png?t=2020-09-08-21-21-20)

CDP is a global non-profit that drives companies and governments to reduce their greenhouse gas emissions, safeguard water resources, and protect forests. Each year, CDP takes the information supplied in its annual reporting process and scores companies and cities based on their journey through disclosure and towards environmental leadership.

CDP houses the world’s largest, most comprehensive dataset on environmental action. As the data grows to include thousands more companies and cities each year, there is increasing potential for the data to be utilized in impactful ways. Because of this potential, CDP is excited to launch an analytics challenge for the Kaggle community. Data scientists will scour environmental information provided to CDP by disclosing companies and cities, searching for solutions to our most pressing problems related to climate change, water security, deforestation, and social inequity.

* How do you help cities adapt to a rapidly changing climate amidst a global pandemic, but do it in a way that is socially equitable?

* What are the projects that can be invested in that will help pull cities out of a recession, mitigate climate issues, but not perpetuate racial/social inequities?

* What are the practical and actionable points where city and corporate ambition join, i.e. where do cities have problems that corporations affected by those problems could solve, and vice versa?

* How can we measure the intersection between environmental risks and social equity, as a contributor to resiliency?

Overview of the Data:

The CDP dataset consists of publicly available responses to 3 different surveys: (1) corporate climate change disclosures; (2) corporate water security disclosures; and (3) disclosures from cities. Data is available for 2018, 2019, and 2020, along with a small collection of supplementary datasets. A starter notebook demonstrates how to load and work with the data.

Main Data:
* 2018_Full_Climate_Change_Dataset.csv (298.53 MB)
* 2019_Full_Climate_Change_Dataset.csv (879.62 MB)
* 2020_Full_Climate_Change_Dataset.csv (511.43 MB)
* 2018_Full_Water_Security_Dataset.csv (64.52 MB)
* 2019_Full_Water_Security_Dataset.csv (128.6 MB)
* 2020_Full_Water_Security_Dataset.csv (127.32 MB)
* 2018_Full_Cities_Dataset.csv (71.16 MB)
* 2019_Full_Cities_Dataset.csv (214.66 MB)
* 2020_Full_Cities_Dataset.csv (365.1 MB)
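
Several of these files are hundreds of megabytes, so it can help to read only the columns you need, or to stream a file in chunks. The following is a minimal sketch using `pandas.read_csv` with `usecols` and `chunksize`; the in-memory CSV and its column names are made-up stand-ins, not the real CDP schema.

```python
import io
import pandas as pd

# Stand-in for a large CSV on disk (hypothetical columns, not the real schema).
csv_text = (
    "account,year,question,answer\n"
    "1,2018,Q1,yes\n"
    "2,2018,Q1,no\n"
    "1,2019,Q2,yes\n"
    "3,2020,Q1,yes\n"
)

# Option 1: load only the columns needed for the analysis.
slim = pd.read_csv(io.StringIO(csv_text), usecols=["account", "year"])

# Option 2: stream the file in small chunks and aggregate as you go.
rows_per_year = {}
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    for year, n in chunk["year"].value_counts().items():
        rows_per_year[int(year)] = rows_per_year.get(int(year), 0) + int(n)

print(slim.shape)
print(rows_per_year)
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path, and the chunk size would be much larger.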

![https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1314380%2F6f0f4d334e5b094bfcf002c4d2e931f6%2FCDP_dataset.png?generation=1603468553539656&alt=media](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1314380%2F6f0f4d334e5b094bfcf002c4d2e931f6%2FCDP_dataset.png?generation=1603468553539656&alt=media)

For details, see https://www.kaggle.com/c/cdp-unlocking-climate-solutions/data

# Let's visualize the data

Import libraries

In [None]:
import numpy as np
import pylab as pl
import pandas as pd
import matplotlib.pyplot as plt 
%matplotlib inline
import seaborn as sns
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
from sklearn.utils import shuffle
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix,classification_report
from sklearn.model_selection import cross_val_score, GridSearchCV
# Input data files are available in the "../input/" directory.

# Read Dataset

# Cities Disclosing

In [None]:
Cities_2018 = pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Disclosing/2018_Cities_Disclosing_to_CDP.csv')
Cities_2019 = pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Disclosing/2019_Cities_Disclosing_to_CDP.csv')
Cities_2020 = pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Disclosing/2020_Cities_Disclosing_to_CDP.csv')


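# Stack the yearly files, including each year exactly once so per-year counts are not inflated
Data = pd.concat([Cities_2018, Cities_2019, Cities_2020], ignore_index=True)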
x = Data.iloc[:, [3]].values
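
When stacking yearly files, it is worth checking that each year contributes exactly its own rows. The following is a small sketch of one way to do that with `pd.concat` and a dict of frames; the toy frames and the `source_year` label are assumptions for illustration, not part of the CDP data.

```python
import pandas as pd

# Toy yearly frames (hypothetical values) standing in for the three CSV files.
frames = {
    2018: pd.DataFrame({"City": ["A", "B"], "Population": [100, 200]}),
    2019: pd.DataFrame({"City": ["A", "C"], "Population": [110, 300]}),
    2020: pd.DataFrame({"City": ["B"], "Population": [210]}),
}

# Concatenate once per year; the dict keys become an outer index level.
combined = pd.concat(frames, names=["source_year", "row"])

# Each year's row count should match its source frame.
rows_by_year = combined.groupby(level="source_year").size()
print(rows_by_year)

# Fully duplicated rows would hint that a frame was added more than once.
n_dupes = int(combined.reset_index(level="source_year").duplicated().sum())
print(n_dupes)
```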

In [None]:
Data.head()

# Population

### Compare population across countries: group by Country and compute the max, mean, and min of Population

In [None]:

display(Data[["Country","Population",]].groupby(["Country"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
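
The same kind of per-country summary can be written with named aggregation, which gives flat, readable column names. This is a sketch on made-up numbers; the output column names `max_pop`, `mean_pop`, and `min_pop` are illustrative choices.

```python
import pandas as pd

# Hypothetical populations for two placeholder countries.
cities = pd.DataFrame({
    "Country": ["X", "X", "Y", "Y", "Y"],
    "Population": [100, 300, 50, 150, 400],
})

# Named aggregation: each keyword becomes an output column.
summary = (
    cities.groupby("Country")["Population"]
    .agg(max_pop="max", mean_pop="mean", min_pop="min")
    .sort_values("max_pop", ascending=False)
)
print(summary)
```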

In [None]:
Data['Account_Number'] = Data['Account Number']
Data['Population_Year'] = Data['Population Year']
Data['Reporting_Authority'] = Data['Reporting Authority']

Data

## Population by time series 


In [None]:
#Data['Population_Year'] = Data["Population_Year"].astype("int")
#print('population by time series ')
display(Data[["Population_Year",'Country','City','Population','Reporting_Authority']].groupby(["Country","City",
                                                        "Population_Year", ]).agg("sum").sort_values(by="Population",
                                                          ascending = False).head(100).style.background_gradient(cmap='autumn'))

## Population distribution

In [None]:
# Plot Population distribution
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15,4))
plt.subplots_adjust(wspace=0.5, hspace=0.5)

# One histogram per population range; histplot is the current seaborn call for what distplot(kde=False) drew
for ax, upper in zip(axes, [10_000, 100_000, 1_000_000]):
    sns.histplot(Data.Population[(Data.Population >= 0) & (Data.Population <= upper)], ax=ax)
    ax.set_xlabel('Population')
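
The three panels zoom in on successively wider population ranges. Another option, sketched here with made-up numbers, is to count cities in bins whose edges are powers of ten, so that small towns and megacities fit in one view.

```python
import numpy as np

# Hypothetical populations spanning several orders of magnitude.
population = np.array(
    [800, 5_000, 12_000, 90_000, 250_000, 750_000, 3_000_000, 9_500_000]
)

# Bin edges at powers of ten, from one hundred to ten million.
edges = np.array([1e2, 1e3, 1e4, 1e5, 1e6, 1e7])
counts, _ = np.histogram(population, bins=edges)

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"{int(lo):>10,} to {int(hi):>10,}: {n}")
```

The same `edges` array can be passed as `bins=` to `plt.hist`, together with `plt.xscale('log')`, to draw the chart.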

## Account_Number by Country

In [None]:
population_1 = Data.sort_values(by='Account_Number', ascending=True)[:100]
figure = plt.figure(figsize=(10,6))
sns.barplot(y=population_1.Country, x=population_1.Account_Number)
plt.xticks(rotation=90)
plt.xlabel('Account_Number')
plt.ylabel('Country')
plt.title('Account_Number by Country')
plt.show()

## Country Population

In [None]:
population = Data.sort_values(by='Population', ascending=True)[:100]
figure = plt.figure(figsize=(10,6))
sns.barplot(y=population.Country, x=population.Population)
plt.xticks()
plt.xlabel('Population')
plt.ylabel('Country')
plt.title('Country Population ')
plt.show()

## City Population

In [None]:
population = Data.sort_values(by='Population', ascending=True)[:100]
figure = plt.figure(figsize=(10,6))
sns.barplot(y=population.City, x=population.Population)
plt.xticks()
plt.xlabel('Population')
plt.ylabel('City')
plt.title('City Population')
plt.show()

##  Year Reported to CDP

In [None]:
Data['Year_Reported_to_CDP'] = Data['Year Reported to CDP']
Data['Account_Number'] = Data['Account Number']
Data

In [None]:
YR_CDP = Data['Year_Reported_to_CDP'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=YR_CDP.index, y=YR_CDP.values, alpha=0.8)
plt.ylabel('Number of Data', fontsize=12)
plt.xlabel('Year_Reported_to_CDP', fontsize=9)
plt.xticks(rotation=90)
plt.show();

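In the chart above, the leftmost bar (2018) is the tallest, with at least 1,200 rows. Note that per-year counts reflect how `Data` was assembled: any year that appears more than once in the concatenation is over-represented.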
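
Many cells in this notebook repeat the same pattern: count the values of one column, keep the most frequent ones, and plot them. A small helper can prepare the counts as a tidy frame. This is a sketch only; `top_counts` is a hypothetical name and the demo data is made up.

```python
import pandas as pd

def top_counts(frame, column, n=50):
    """Return the n most frequent values of `column` as a two-column frame."""
    counts = frame[column].value_counts().head(n)
    return pd.DataFrame({column: counts.index, "count": counts.values})

# Made-up reporting years, only to exercise the helper.
demo = pd.DataFrame({"Year_Reported_to_CDP": [2018, 2019, 2019, 2020, 2019, 2018]})
yearly = top_counts(demo, "Year_Reported_to_CDP")
print(yearly)
```

The result can then be drawn with keyword arguments, for example `sns.barplot(data=yearly, x="Year_Reported_to_CDP", y="count")`.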

In [None]:
population = Data.sort_values(by='Population', ascending=True)[:100]
figure = plt.figure(figsize=(10,6))
sns.barplot(y=population.Year_Reported_to_CDP, x=population.Population, orient='h')
plt.xticks(rotation=90)
plt.xlabel('Population')
plt.ylabel('Year Reported to CDP')
plt.title('Year Reported to CDP ')
plt.show()

##  Population Year

In [None]:
population = Data.sort_values(by='Population', ascending=True)[:100]
figure = plt.figure(figsize=(10,6))
sns.barplot(y=population.Population_Year, x=population.Population, orient='h')
plt.xticks(rotation=90)
plt.xlabel('Population')
plt.ylabel('Population_Year')
plt.title('Population_Year ')
plt.show()

In [None]:
cdp = pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Disclosing/Cities_Disclosing_to_CDP_Data_Dictionary.csv')


In [None]:
cdp.head()

In [None]:
# Start with one description:
text = cdp.description[1]

# Create and generate a word cloud image:
wordcloud = WordCloud().generate(text)

# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

In [None]:
# lower max_font_size, change the maximum number of word and lighten the background:
wordcloud = WordCloud(max_font_size=50, max_words=100, background_color="white").generate(text)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()

In [None]:
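# Join all non-missing descriptions into one string and count whitespace-separated words
text = " ".join(str(review) for review in cdp.description.dropna())
print("There are {} words in the combined CDP descriptions.".format(len(text.split())))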

In [None]:
# Create stopword list:
stopwords = set(STOPWORDS)
stopwords.update(["city", "population", "organisation"])

# Generate a word cloud image
wordcloud = WordCloud(stopwords=stopwords, background_color="white").generate(text)

# Display the generated image:
# the matplotlib way:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
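
A word cloud is driven by word frequencies after stopwords are removed. The counting step can be sketched with the standard library alone; the descriptions and the stopword set below are invented for illustration and are not taken from the data dictionary.

```python
import re
from collections import Counter

# Invented field descriptions standing in for the data dictionary text.
descriptions = [
    "Name of the city or organisation that disclosed",
    "Country of the disclosing city",
    "Year the city disclosed to CDP",
    "Region of the disclosing organisation",
]
stopwords = {"of", "the", "or", "that", "to", "city", "population", "organisation"}

# Lowercase, split into alphabetic tokens, then drop the stopwords.
words = re.findall(r"[a-z]+", " ".join(descriptions).lower())
freq = Counter(w for w in words if w not in stopwords)

print(len(words))
print(freq.most_common(2))
```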

# Cities Responses

In [None]:
Responses_2018= pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Responses/2018_Full_Cities_Dataset.csv')
Responses_2019= pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Responses/2019_Full_Cities_Dataset.csv')
Responses_2020= pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Responses/2020_Full_Cities_Dataset.csv')


# Stack the yearly response files, including each year exactly once
Data2 = pd.concat([Responses_2018, Responses_2019, Responses_2020], ignore_index=True)
x = Data2.iloc[:, [3]].values


In [None]:
Data2['CDP_Region'] = Data2['CDP Region']
Data2['Account_Number'] = Data2['Account Number']
Data2['Parent_Section'] = Data2['Parent Section']
Data2['Question_Number'] = Data2['Question Number']
Data2['Question_Name'] = Data2['Question Name']
Data2['Response_Answer'] = Data2['Response Answer']
Data2['Last_update'] = Data2['Last update']
Data2
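
Copying each column under a new name works, but it keeps both versions in memory. An alternative, sketched here on a toy frame with a few of the same column names, is to rename all columns in one step.

```python
import pandas as pd

# Toy frame with values invented for illustration.
responses = pd.DataFrame({
    "Account Number": [1, 2],
    "CDP Region": ["Europe", "Africa"],
    "Last update": ["2020-01-01", "2020-02-01"],
})

# Replace spaces with underscores in every column name at once.
renamed = responses.rename(columns=lambda name: name.replace(" ", "_"))
print(list(renamed.columns))
```

`rename` returns a new frame, so the original column names stay available on `responses`.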

## Account_Number

In [None]:
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objs as go

cr = Data2['Account_Number'].value_counts().reset_index()
cr.columns = [
    'Account_Number', 
    'count'
]
cr['Account_Number'] = cr['Account_Number'].astype(str) + '-'
cr = cr.sort_values(['count']).tail(50)

fig = px.bar(
    cr, 
    x='count', 
    y='Account_Number', 
    orientation='h', 
    title='Top 50 accounts (Account_Number) by number of responses', 
    width=500,
    height=900 
)

fig.show()

## Question_Number

In [None]:
cr = Data2['Question_Number'].value_counts().reset_index()
cr.columns = [
    'Question_Number', 
    'percent'
]
cr['percent'] /= len(Data2)

fig = px.pie(
    cr, 
    names='Question_Number', 
    values='percent', 
    title='Share of responses by Question_Number', 
    width=1000,
    height=500 
)

fig.show()
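
A pie chart with many question numbers becomes hard to read. One possible refinement, sketched here on made-up question numbers, is to compute shares directly with `value_counts(normalize=True)` and fold the small slices into an "Other" bucket. The 15% threshold is an arbitrary choice for the example.

```python
import pandas as pd

# Made-up question numbers, one entry per response row.
questions = pd.Series(
    ["1.0", "1.0", "1.0", "1.0", "2.1", "2.1", "2.1", "3.0", "4.2", "5.5"]
)

# Shares that sum to 1, most frequent first.
share = questions.value_counts(normalize=True)

# Keep slices at or above the threshold and merge the rest.
threshold = 0.15
major = share[share >= threshold]
other = share[share < threshold].sum()
pie_data = pd.concat([major, pd.Series({"Other": other})])
print(pie_data)
```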

## Number of Section

In [None]:
cnt_pro = Data2['Section'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=cnt_pro.index, y=cnt_pro.values, alpha=0.8)
plt.ylabel('Number of Section', fontsize=12)
plt.xlabel('Section', fontsize=9)
plt.xticks(rotation=90)
plt.show();

In the chart above, the leftmost bar, Citywide GHG Emission Data, is the tallest section, with at least 250,000 rows.

## Number of Parent Section

In [None]:
cnt_pro = Data2['Parent Section'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=cnt_pro.index, y=cnt_pro.values, alpha=0.8)
plt.ylabel('Number of Parent Section', fontsize=12)
plt.xlabel('Parent Section', fontsize=9)
plt.xticks(rotation=90)
plt.show();

In the chart above, the leftmost bar, Emissions Reductions, is the tallest parent section, with at least 24,000 rows.


# Cities responses by country

In [None]:
def Cities_Responses(x):
    y = Data2[["Country","Account_Number","Organization","Section","CDP_Region","Question_Number","Question_Name", "Response_Answer","Last_update"]][Data2["Country"] == x]
    y = y.sort_values(by="Account_Number",ascending=False)
    return y.head(500)
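
Because the helper takes the country as its only argument, the per-country tables can also be built in a loop rather than one cell per country. The sketch below uses a toy frame and a variant of the helper that receives the frame explicitly; `responses_for` and the sample rows are hypothetical.

```python
import pandas as pd

# Toy responses with a subset of the real column names and made-up values.
responses = pd.DataFrame({
    "Country": ["Brazil", "Brazil", "Portugal", "Denmark"],
    "Account_Number": [10, 30, 20, 40],
    "Response_Answer": ["a", "b", "c", "d"],
})

def responses_for(frame, country, limit=500):
    """Rows for one country, highest Account_Number first."""
    subset = frame.loc[frame["Country"] == country]
    return subset.sort_values("Account_Number", ascending=False).head(limit)

# Build one table per country of interest.
tables = {c: responses_for(responses, c) for c in ["Brazil", "Portugal"]}
for country, table in tables.items():
    print(country, len(table))
```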

## Brazil

In [None]:
Cities_Responses("Brazil")

## Portugal

In [None]:
Cities_Responses("Portugal")

## Malaysia

In [None]:
Cities_Responses("Malaysia")

## Denmark

In [None]:
Cities_Responses("Denmark")

## Argentina

In [None]:
Cities_Responses("Argentina")

## Mexico

In [None]:
Cities_Responses("Mexico")

## Costa Rica

In [None]:
Cities_Responses("Costa Rica")


## Netherlands

In [None]:
Cities_Responses("Netherlands")


## Turkey

In [None]:
Cities_Responses("Turkey")


## Ecuador

In [None]:
Cities_Responses("Ecuador")


## Colombia

In [None]:
Cities_Responses("Colombia")


## Honduras

In [None]:
Cities_Responses("Honduras")


## Namibia

In [None]:
Cities_Responses("Namibia")


## Sweden

In [None]:
Cities_Responses("Sweden")


## United Kingdom of Great Britain and Northern Ireland

In [None]:
Cities_Responses("United Kingdom of Great Britain and Northern Ireland")


## Philippines

In [None]:
Cities_Responses("Philippines")


## Peru

In [None]:
Cities_Responses("Peru")


## Germany

In [None]:
Cities_Responses("Germany")


## France

In [None]:
Cities_Responses("France")

## Norway

In [None]:

Cities_Responses("Norway")

## Canada

In [None]:
Cities_Responses("Canada")

## Italy

In [None]:
Cities_Responses("Italy")

## Indonesia

In [None]:
Cities_Responses("Indonesia")

## India

In [None]:
Cities_Responses("India")

## United States of America

In [None]:
Cities_Responses("United States of America")

## Thailand

In [None]:
Cities_Responses("Thailand")

## China

In [None]:
Cities_Responses("China")

## Republic of Korea

In [None]:
Cities_Responses("Republic of Korea")

In [None]:
Full_Cities_Response= pd.read_csv('../input/cdp-unlocking-climate-solutions/Cities/Cities Responses/Full_Cities_Response_Data_Dictionary.csv')


In [None]:
Full_Cities_Response.head()

In [None]:
field = Full_Cities_Response['field'].value_counts()  [:100]
plt.figure(figsize=(6,4))
sns.barplot(x=field.index, y=field.values, alpha=0.8)
plt.ylabel('Number of field', fontsize=12)
plt.xlabel('field', fontsize=9)
plt.xticks(rotation=90)
plt.show();

# Corporations

## Climate Change

In [None]:
Climate_Change_2018= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Climate Change/2018_Corporates_Disclosing_to_CDP_Climate_Change.csv')
Climate_Change_2019= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Climate Change/2019_Corporates_Disclosing_to_CDP_Climate_Change.csv')
Climate_Change_2020= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Climate Change/2020_Corporates_Disclosing_to_CDP_Climate_Change.csv')


# Stack the yearly files, including each year exactly once
Data3 = pd.concat([Climate_Change_2018, Climate_Change_2019, Climate_Change_2020], ignore_index=True)
x = Data3.iloc[:, [3]].values


In [None]:
Data3.head()

# Governance

## Organization	

In [None]:
org = Data3.sort_values(by='account_number', ascending=True)[:50]
figure = plt.figure(figsize=(10,6))
sns.barplot(y=org.organization, x=org.account_number)
plt.xticks(rotation=90)
plt.xlabel('account_number')
plt.ylabel('organization ')
plt.title('Number of organization')
plt.show()

## Minimum_tier

In [None]:
min_tier = Data3['minimum_tier'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=min_tier.index, y=min_tier.values, alpha=0.8)
plt.ylabel('Number of minimum_tier', fontsize=12)
plt.xlabel('minimum_tier', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Authority_types

In [None]:
A_type = Data3['authority_types'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=A_type.index, y=A_type.values, alpha=0.8)
plt.ylabel('Number of Authority_types', fontsize=12)
plt.xlabel('authority_types', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Activities

In [None]:
Act = Data3['activities'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=Act.index, y=Act.values, alpha=0.8)
plt.ylabel('Number of activities', fontsize=12)
plt.xlabel('activities', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Sectors	

In [None]:
sectors = Data3['sectors'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=sectors.index, y=sectors.values, alpha=0.8)
plt.ylabel('Number of sectors', fontsize=12)
plt.xlabel('sectors', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Industries	

In [None]:
industry = Data3['industries'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=industry.index, y=industry.values, alpha=0.8)
plt.ylabel('Number of industries', fontsize=12)
plt.xlabel('industries', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Primary_industry

In [None]:
P_industry = Data3['primary_industry'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=P_industry.index, y=P_industry.values, alpha=0.8)
plt.ylabel('Number of primary_industry', fontsize=12)
plt.xlabel('primary_industry', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Primary_sector	

In [None]:
P_sector = Data3['primary_sector'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=P_sector.index, y=P_sector.values, alpha=0.8)
plt.ylabel('Number of primary_sector', fontsize=12)
plt.xlabel('primary_sector', fontsize=9)
plt.xticks(rotation=90)
plt.show();

In [None]:
display(Data3[Data3["public"]=="public"][["country","response_received_date","authority_types","activities","sectors","industries","primary_questionnaire_sector",
                                       "survey_year"]].sort_values(by="survey_year", ascending= False).head(500).style.background_gradient(cmap="spring"))

In [None]:
Climate_Change_Dict= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Climate Change/Corporations_Disclosing_to_CDP_Data_Dictionary.csv')


In [None]:
Climate_Change_Dict.head()

In [None]:
Climate_Change_Dict = Climate_Change_Dict['field'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=Climate_Change_Dict.index, y=Climate_Change_Dict.values, alpha=0.8)
plt.ylabel('Number of field', fontsize=12)
plt.xlabel('field', fontsize=9)
plt.xticks(rotation=90)
plt.show();

# Water Security

UN-Water proposes the following definition of water security: "The capacity of a population to safeguard sustainable access to adequate quantities of acceptable quality water for sustaining livelihoods, human well-being, and socio-economic development, for ensuring protection against water-borne pollution and water-related disasters, and for preserving ecosystems in a climate of peace and political stability."

Source: https://www.unwater.org/publications/water-security-infographic/


In [None]:
Water_Security_2018= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Water Security/2018_Corporates_Disclosing_to_CDP_Water_Security.csv')
Water_Security_2019= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Water Security/2019_Corporates_Disclosing_to_CDP_Water_Security.csv')
Water_Security_2020= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Water Security/2020_Corporates_Disclosing_to_CDP_Water_Security.csv')


# Stack the yearly files, including each year exactly once
Data4 = pd.concat([Water_Security_2018, Water_Security_2019, Water_Security_2020], ignore_index=True)
x = Data4.iloc[:, [3]].values


In [None]:
Data4.head()

# Governance

## organization

In [None]:
org = Data4.sort_values(by='account_number', ascending=True)[:50]
figure = plt.figure(figsize=(10,6))
sns.barplot(y=org.organization, x=org.account_number)
plt.xticks(rotation=90)
plt.xlabel('account_number')
plt.ylabel('organization ')
plt.title('Number of organization')
plt.show()

## Country

In [None]:
country = Data4['country'].value_counts() 
plt.figure(figsize=(6,4))
sns.barplot(x=country.index, y=country.values, alpha=0.8)
plt.ylabel('Number of country', fontsize=12)
plt.xlabel('country', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## samples	

In [None]:
smp = Data4['samples'].value_counts()  
plt.figure(figsize=(6,4))
sns.barplot(x=smp.index, y=smp.values, alpha=0.8)
plt.ylabel('Number of samples', fontsize=12)
plt.xlabel('samples', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Minimum_tier

In [None]:
min_tier = Data4['minimum_tier'].value_counts()  
plt.figure(figsize=(6,4))
sns.barplot(x=min_tier.index, y=min_tier.values, alpha=0.8)
plt.ylabel('Number of minimum_tier', fontsize=12)
plt.xlabel('minimum_tier', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Authority_types

In [None]:
A_type = Data4['authority_types'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=A_type.index, y=A_type.values, alpha=0.8)
plt.ylabel('Number of Authority_types', fontsize=12)
plt.xlabel('authority_types', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Activities

In [None]:
Act = Data4['activities'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=Act.index, y=Act.values, alpha=0.8)
plt.ylabel('Number of activities', fontsize=12)
plt.xlabel('activities', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Sectors

In [None]:
sectors = Data4['sectors'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=sectors.index, y=sectors.values, alpha=0.8)
plt.ylabel('Number of sectors', fontsize=12)
plt.xlabel('sectors', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Industries

In [None]:
industry = Data4['industries'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=industry.index, y=industry.values, alpha=0.8)
plt.ylabel('Number of industries', fontsize=12)
plt.xlabel('industries', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Primary_industry

In [None]:
P_industry = Data4['primary_industry'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=P_industry.index, y=P_industry.values, alpha=0.8)
plt.ylabel('Number of primary_industry', fontsize=12)
plt.xlabel('primary_industry', fontsize=9)
plt.xticks(rotation=90)
plt.show();

## Primary_sector

In [None]:
P_sector = Data4['primary_sector'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=P_sector.index, y=P_sector.values, alpha=0.8)
plt.ylabel('Number of primary_sector', fontsize=12)
plt.xlabel('primary_sector', fontsize=9)
plt.xticks(rotation=90)
plt.show();

In [None]:
display(Data4[Data4["public"]=="public"][["country","authority_types","samples","minimum_tier","activities","sectors","industries","primary_questionnaire_sector",
                                       "survey_year"]].sort_values(by="survey_year", ascending= False).head(100).style.background_gradient(cmap="spring"))

In [None]:
def Water_Security(x):
    y = Data4[["account_number","country","organization","region","samples","minimum_tier","response_received_date","sectors", "industries","primary_ticker"]][Data4["country"] == x]
    y = y.sort_values(by="account_number",ascending=False)
    return y.head(100)

# United States of America

In [None]:
Water_Security("United States of America")

# Canada

In [None]:
Water_Security("Canada")

In [None]:
Water_Security_Dict= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Disclosing/Water Security/Corporations_Disclosing_to_CDP_Data_Dictionary.csv')


In [None]:
Water_Security_Dict.head()

In [None]:
Water_Security_Dict = Water_Security_Dict['field'].value_counts()  [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=Water_Security_Dict.index, y=Water_Security_Dict.values, alpha=0.8)
plt.ylabel('Number of field', fontsize=12)
plt.xlabel('field', fontsize=9)
plt.xticks(rotation=90)
plt.show();

# Corporations Responses

# Climate Change

In [None]:
Full_Climate_Change_2018= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Climate Change/2018_Full_Climate_Change_Dataset.csv')
Full_Climate_Change_2019= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Climate Change/2019_Full_Climate_Change_Dataset.csv')
Full_Climate_Change_2020= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Climate Change/2020_Full_Climate_Change_Dataset.csv')


# Stack the yearly files, including each year exactly once
Data5 = pd.concat([Full_Climate_Change_2018, Full_Climate_Change_2019, Full_Climate_Change_2020], ignore_index=True)
x = Data5.iloc[:, [3]].values


In [None]:
Data5.head()

In [None]:
org = Data5.sort_values(by='account_number', ascending=True)
figure = plt.figure(figsize=(15,6))
sns.barplot(y=org.organization, x=org.account_number)
plt.xticks(rotation=50)
plt.xlabel('account_number')
plt.ylabel('organization ')
plt.title('Number of organization')
plt.show()

In [None]:
# Compare as text so the filter works whether survey_year was parsed as int or str
display(Data5[Data5["survey_year"].astype(str)=="2020"][["organization","ors_response_id","question_unique_reference","data_point_name","data_point_id","response_value",
                                       "account_number"]].sort_values(by="account_number", ascending= False).head(5).style.background_gradient(cmap="spring"))

In [None]:
Full_Corporations_Response_Data_Dict= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Climate Change/Full_Corporations_Response_Data_Dictionary copy.csv')


In [None]:
Full_Corporations_Response_Data_Dict.head()

In [None]:
Full_Corporations_Response_Data_Dict = Full_Corporations_Response_Data_Dict['field'].value_counts() [:50]
plt.figure(figsize=(6,4))
sns.barplot(x=Full_Corporations_Response_Data_Dict.index, y=Full_Corporations_Response_Data_Dict.values, alpha=0.8)
plt.ylabel('Number of field', fontsize=12)
plt.xlabel('field', fontsize=9)
plt.xticks(rotation=90)
plt.show();

# Water Security

In [None]:
Full_Water_Security_2018= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Water Security/2018_Full_Water_Security_Dataset.csv')
Full_Water_Security_2019= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Water Security/2019_Full_Water_Security_Dataset.csv')
Full_Water_Security_2020= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Water Security/2020_Full_Water_Security_Dataset.csv')


# Stack the yearly files, including each year exactly once
Data6 = pd.concat([Full_Water_Security_2018, Full_Water_Security_2019, Full_Water_Security_2020], ignore_index=True)
x = Data6.iloc[:, [3]].values


In [None]:
Data6.head()

In [None]:
cr = Data6['question_number'].value_counts().reset_index()
cr.columns = [
    'question_number', 
    'percent'
]
cr['percent'] /= len(Data6)

fig = px.pie(
    cr, 
    names='question_number', 
    values='percent', 
    title='Share of responses by question_number (Water Security)', 
    width=800,
    height=500 
)

fig.show()

In [None]:
# Compare as text so the filter works whether survey_year was parsed as int or str
display(Data6[Data6["survey_year"].astype(str)=="2019"][["organization","response_received_date","accounting_period_to","ors_response_id","question_unique_reference","data_point_name","data_point_id","row_number","response_value",
                                       "survey_year"]].sort_values(by="survey_year", ascending= False).head(5).style.background_gradient(cmap="spring"))

In [None]:
Full_Corporations_Response_Data_Dict= pd.read_csv('../input/cdp-unlocking-climate-solutions/Corporations/Corporations Responses/Water Security/Full_Corporations_Response_Data_Dictionary.csv')


In [None]:
Full_Corporations_Response_Data_Dict

In [None]:
Full_Corporations_Response_Data_Dict = Full_Corporations_Response_Data_Dict['field'].value_counts()
plt.figure(figsize=(6,4))
sns.barplot(x=Full_Corporations_Response_Data_Dict.index, y=Full_Corporations_Response_Data_Dict.values, alpha=0.8)
plt.ylabel('Number of field', fontsize=12)
plt.xlabel('field', fontsize=9)
plt.xticks(rotation=90)
plt.show();

# Supplementary Data

In [None]:
CDC_500_Cities_Census_Tract_Data= pd.read_csv('../input/cdp-unlocking-climate-solutions/Supplementary Data/CDC 500 Cities Census Tract Data/500_Cities__Census_Tract-level_Data__GIS_Friendly_Format___2019_release.csv')


Data were provided by the Centers for Disease Control and Prevention (CDC), Division of Population Health, Epidemiology and Surveillance Branch.

In [None]:
CDC_500_Cities_Census_Tract_Data

# Health Census Information

## ACCESS2_CrudePrev

In [None]:
# Group by state and place only; grouping on the confidence-interval text would split each place into many tiny groups
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","ACCESS2_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
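
The health measures in this file follow the naming pattern `<MEASURE>_CrudePrev`, so the per-place summary can be computed for several measures in one pass. The sketch below uses a toy frame with invented prevalence values; only the column naming pattern is taken from the dataset.

```python
import pandas as pd

# Toy tract-level rows with invented prevalence values.
tracts = pd.DataFrame({
    "StateAbbr": ["CA", "CA", "CA", "NY"],
    "PlaceName": ["Los Angeles", "Los Angeles", "San Diego", "New York"],
    "ACCESS2_CrudePrev": [10.0, 20.0, 12.0, 8.0],
    "BINGE_CrudePrev": [15.0, 17.0, 20.0, 18.0],
})

# Build the column names from the measure codes, then aggregate them together.
measures = ["ACCESS2", "BINGE"]
columns = [f"{m}_CrudePrev" for m in measures]
summary = tracts.groupby(["StateAbbr", "PlaceName"])[columns].agg(["max", "mean", "min"])
print(summary)
```

The result has one row per place and a two-level column index (measure, statistic), which can be styled with `background_gradient` in the same way as the single-measure tables.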


## ARTHRITIS_CrudePrev

In [None]:
# Summarize the numeric prevalence column only; the Crude95CI column is text and has no meaningful mean
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","ARTHRITIS_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## BINGE_CrudePrev

In [None]:
# Summarize the numeric prevalence column only; the Crude95CI column is text and has no meaningful mean
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","BINGE_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## BPHIGH_CrudePrev

In [None]:
# Summarize the numeric prevalence column only; the Crude95CI column is text and has no meaningful mean
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","BPHIGH_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## BPMED_CrudePrev

In [None]:
# Summarize the numeric prevalence column only; the Crude95CI column is text and has no meaningful mean
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","BPMED_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## CANCER_CrudePrev

In [None]:
# Summarize the numeric prevalence column only; the Crude95CI column is text and has no meaningful mean
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CANCER_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## CASTHMA_CrudePrev

In [None]:
# Summarize the numeric prevalence column only; the Crude95CI column is text and has no meaningful mean
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CASTHMA_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## CHD_CrudePrev

In [None]:
# Summarize the numeric prevalence column only; the Crude95CI column is text and has no meaningful mean
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CHD_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## CHECKUP_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CHECKUP_CrudePrev","CHECKUP_Crude95CI"]].groupby(["StateAbbr","PlaceName","CHECKUP_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CHECKUP_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## CHOLSCREEN_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CHOLSCREEN_CrudePrev","CHOLSCREEN_Crude95CI"]].groupby(["StateAbbr","PlaceName","CHOLSCREEN_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CHOLSCREEN_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## COLON_SCREEN_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COLON_SCREEN_CrudePrev","COLON_SCREEN_Crude95CI"]].groupby(["StateAbbr","PlaceName","COLON_SCREEN_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COLON_SCREEN_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## COPD_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COPD_CrudePrev","COPD_Crude95CI"]].groupby(["StateAbbr","PlaceName","COPD_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COPD_CrudePrev"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## COREM_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COREM_CrudePrev","COREM_Crude95CI"]].groupby(["StateAbbr","PlaceName","COREM_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COREM_CrudePrev","COREM_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## COREW_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COREW_CrudePrev","COREW_Crude95CI"]].groupby(["StateAbbr","PlaceName","COREW_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","COREW_CrudePrev","COREW_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## CSMOKING_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CSMOKING_CrudePrev","CSMOKING_Crude95CI"]].groupby(["StateAbbr","PlaceName","CSMOKING_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","CSMOKING_CrudePrev","CSMOKING_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## DENTAL_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","DENTAL_CrudePrev","DENTAL_Crude95CI"]].groupby(["StateAbbr","PlaceName","DENTAL_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","DENTAL_CrudePrev","DENTAL_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## DIABETES_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","DIABETES_CrudePrev","DIABETES_Crude95CI"]].groupby(["StateAbbr","PlaceName","DIABETES_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","DIABETES_CrudePrev","DIABETES_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## HIGHCHOL_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","HIGHCHOL_CrudePrev","HIGHCHOL_Crude95CI"]].groupby(["StateAbbr","PlaceName","HIGHCHOL_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","HIGHCHOL_CrudePrev","HIGHCHOL_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## KIDNEY_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","KIDNEY_CrudePrev","KIDNEY_Crude95CI"]].groupby(["StateAbbr","PlaceName","KIDNEY_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","KIDNEY_CrudePrev","KIDNEY_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## LPA_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","LPA_CrudePrev","LPA_Crude95CI"]].groupby(["StateAbbr","PlaceName","LPA_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","LPA_CrudePrev","LPA_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## MAMMOUSE_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","MAMMOUSE_CrudePrev","MAMMOUSE_Crude95CI"]].groupby(["StateAbbr","PlaceName","MAMMOUSE_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","MAMMOUSE_CrudePrev","MAMMOUSE_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## MHLTH_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","MHLTH_CrudePrev","MHLTH_Crude95CI"]].groupby(["StateAbbr","PlaceName","MHLTH_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","MHLTH_CrudePrev","MHLTH_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## OBESITY_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","OBESITY_CrudePrev","OBESITY_Crude95CI"]].groupby(["StateAbbr","PlaceName","OBESITY_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","OBESITY_CrudePrev","OBESITY_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## PAPTEST_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","PAPTEST_CrudePrev","PAPTEST_Crude95CI"]].groupby(["StateAbbr","PlaceName","PAPTEST_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","PAPTEST_CrudePrev","PAPTEST_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## PHLTH_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","PHLTH_CrudePrev","PHLTH_Crude95CI"]].groupby(["StateAbbr","PlaceName","PHLTH_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","PHLTH_CrudePrev","PHLTH_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## SLEEP_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","SLEEP_CrudePrev","SLEEP_Crude95CI"]].groupby(["StateAbbr","PlaceName","SLEEP_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","SLEEP_CrudePrev","SLEEP_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## STROKE_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","STROKE_CrudePrev","STROKE_Crude95CI"]].groupby(["StateAbbr","PlaceName","STROKE_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","STROKE_CrudePrev","STROKE_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))

## TEETHLOST_CrudePrev

In [None]:
#display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","TEETHLOST_CrudePrev","TEETHLOST_Crude95CI"]].groupby(["StateAbbr","PlaceName","TEETHLOST_Crude95CI"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
display(CDC_500_Cities_Census_Tract_Data[["StateAbbr","PlaceName","TEETHLOST_CrudePrev","TEETHLOST_Crude95CI"]].groupby(["StateAbbr","PlaceName"]).agg(["max",'mean',"min"]).style.background_gradient(cmap="cool"))
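
The per-measure cells above all follow the same pattern, so they could be generated by one small helper. The following is a minimal sketch, not the notebook's own method: `summarize_measure` and the `toy` frame are hypothetical names introduced for illustration, and the toy values are made up so the block runs without the real dataset. It aggregates only the numeric `<measure>_CrudePrev` column, on the assumption that the `_Crude95CI` columns are not numeric.

```python
import pandas as pd

def summarize_measure(df, measure):
    """Return max/mean/min of <measure>_CrudePrev per (StateAbbr, PlaceName)."""
    column = f"{measure}_CrudePrev"
    return (
        df[["StateAbbr", "PlaceName", column]]
        .groupby(["StateAbbr", "PlaceName"])
        .agg(["max", "mean", "min"])
    )

# Tiny made-up frame standing in for CDC_500_Cities_Census_Tract_Data.
toy = pd.DataFrame({
    "StateAbbr": ["AL", "AL", "AK"],
    "PlaceName": ["Birmingham", "Birmingham", "Anchorage"],
    "CANCER_CrudePrev": [5.0, 7.0, 4.0],
    "STROKE_CrudePrev": [3.0, 5.0, 2.0],
})

# One summary table per measure, keyed by measure name.
summaries = {m: summarize_measure(toy, m) for m in ["CANCER", "STROKE"]}
cancer_summary = summaries["CANCER"]
```

In the notebook, each table could then be shown the same way as the cells above, for example with `display(summaries["CANCER"].style.background_gradient(cmap="cool"))`, passing the real DataFrame instead of `toy`.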

# CDC 500 Cities Census Tract Data by StateAbbr

In [None]:
def CDC_500_Cities(x):
    """Return up to 500 census tracts for state abbreviation x, most populous first."""
    # Each of the 28 health measures has a <measure>_CrudePrev and a <measure>_Crude95CI column.
    measures = ["ACCESS2","ARTHRITIS","BINGE","BPHIGH","BPMED","CANCER","CASTHMA","CHD","CHECKUP","CHOLSCREEN","COLON_SCREEN","COPD","COREM","COREW","CSMOKING","DENTAL","DIABETES","HIGHCHOL","KIDNEY","LPA","MAMMOUSE","MHLTH","OBESITY","PAPTEST","PHLTH","SLEEP","STROKE","TEETHLOST"]
    columns = (["StateAbbr","PlaceName","PlaceFIPS","TractFIPS","Place_TractID","Population2010"]
               + [f"{m}_{suffix}" for m in measures for suffix in ("CrudePrev","Crude95CI")]
               + ["Geolocation"])
    y = CDC_500_Cities_Census_Tract_Data.loc[CDC_500_Cities_Census_Tract_Data["StateAbbr"] == x, columns]
    y = y.sort_values(by="Population2010",ascending=False)
    return y.head(500)

## AL

In [None]:
CDC_500_Cities("AL")

## AK

In [None]:
CDC_500_Cities("AK")


## AZ

In [None]:
CDC_500_Cities("AZ")


## AR

In [None]:
CDC_500_Cities("AR")


## CA

In [None]:
CDC_500_Cities("CA")


## CO

In [None]:
CDC_500_Cities("CO")


## CT

In [None]:
CDC_500_Cities("CT")


## DE

In [None]:
CDC_500_Cities("DE")


## DC

In [None]:
CDC_500_Cities("DC")


## FL

In [None]:
CDC_500_Cities("FL")


## GA

In [None]:
CDC_500_Cities("GA")


## HI

In [None]:
CDC_500_Cities("HI")


## ID

In [None]:
CDC_500_Cities("ID")


## IL

In [None]:
CDC_500_Cities("IL")


## IN

In [None]:
CDC_500_Cities("IN")


## IA

In [None]:
CDC_500_Cities("IA")


## KS

In [None]:
CDC_500_Cities("KS")


## KY

In [None]:
CDC_500_Cities("KY")


## LA

In [None]:
CDC_500_Cities("LA")


## ME

In [None]:
CDC_500_Cities("ME")


## MD

In [None]:
CDC_500_Cities("MD")


## MA

In [None]:
CDC_500_Cities("MA")


## MI

In [None]:
CDC_500_Cities("MI")


## MN

In [None]:
CDC_500_Cities("MN")


## MS

In [None]:
CDC_500_Cities("MS")


## MO

In [None]:
CDC_500_Cities("MO")


## MT

In [None]:
CDC_500_Cities("MT")


## NE

In [None]:
CDC_500_Cities("NE")


## NV

In [None]:
CDC_500_Cities("NV")


## NH

In [None]:
CDC_500_Cities("NH")


## NJ

In [None]:
CDC_500_Cities("NJ")


## NM

In [None]:
CDC_500_Cities("NM")


## NY

In [None]:
CDC_500_Cities("NY")


## NC

In [None]:
CDC_500_Cities("NC")


## ND

In [None]:
CDC_500_Cities("ND")


## OH

In [None]:
CDC_500_Cities("OH")


## OK

In [None]:
CDC_500_Cities("OK")


## OR

In [None]:
CDC_500_Cities("OR")


## PA

In [None]:
CDC_500_Cities("PA")


## RI

In [None]:
CDC_500_Cities("RI")


## SC

In [None]:
CDC_500_Cities("SC")


## SD

In [None]:
CDC_500_Cities("SD")


## TN

In [None]:
CDC_500_Cities("TN")


## TX

In [None]:
CDC_500_Cities("TX")

## UT

In [None]:
CDC_500_Cities("UT")


## VT

In [None]:
CDC_500_Cities("VT")


## VA

In [None]:
CDC_500_Cities("VA")


## WA

In [None]:
CDC_500_Cities("WA")

## WV

In [None]:
CDC_500_Cities("WV")

## WI

In [None]:
CDC_500_Cities("WI")

## WY

In [None]:
CDC_500_Cities("WY")
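
The state-by-state cells above repeat one call per abbreviation. As a rough sketch of an alternative, the tables could be built in a single pass over whatever states appear in the data. `top_tracts_by_state` and the `toy` frame are hypothetical, with made-up population values; the sketch mirrors the sort-by-`Population2010` and `head(500)` logic of `CDC_500_Cities` but takes the DataFrame as an argument so it runs on its own.

```python
import pandas as pd

def top_tracts_by_state(df, n=500):
    """Map each StateAbbr to its n most populous tracts (hypothetical helper)."""
    return {
        state: group.sort_values(by="Population2010", ascending=False).head(n)
        for state, group in df.groupby("StateAbbr")
    }

# Tiny made-up frame standing in for CDC_500_Cities_Census_Tract_Data.
toy = pd.DataFrame({
    "StateAbbr": ["WY", "WY", "WI"],
    "PlaceName": ["Cheyenne", "Cheyenne", "Madison"],
    "Population2010": [1200, 3400, 2500],
})

tables = top_tracts_by_state(toy, n=500)
wy_populations = list(tables["WY"]["Population2010"])
```

Only states present in the DataFrame get an entry, so a lookup such as `tables["WY"]` raises a `KeyError` for an absent state rather than returning an empty table.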