# Climate change's effect on local food and water resources. 📝

![Banner](./assets/banner.jpeg)

## Topic
*What problem are you (or your stakeholder) trying to address?*
📝 <!-- Answer Below -->

This project will attempt to address if localized climate change has a detrimental effect on local crop production and water resources.

## Project Question
*What specific question are you seeking to answer with this project?*
*This is not the same as the questions you ask to limit the scope of the project.*
📝 <!-- Answer Below -->

1. Is climate change happening in my local area of Southeast Indiana?
2. Is climate change affecting the yield production of local crops, e.g., corn and soybean?
3. Is climate change affecting the water resources, i.e., is there an abundance or scarcity of water based on the amount of yearly rainfall? 

## What would an answer look like?
*What is your hypothesized answer to your question?*
📝 <!-- Answer Below -->

I hypothesize that climate change is occuring in Southeast Indiana, and it has affected local crop yield and reduced a percentage of available water resources.

## Data Sources
*What 3 data sources have you identified for this project?*
*How are you going to relate these datasets?*
📝 <!-- Answer Below -->

1. United States Department of Agriculture (USDA) - https://quickstats.nass.usda.gov
2. Local Climate Analysis Tool (LCAT) - https://lcat.nws.noaa.gov/home
3. National Centers for Environmental Information | Local Climatological Data (LCD) - https://www.ncei.noaa.gov/cdo-web/datatools/lcd

I will relate these data sets by geographical locations, either at the state level or preferably at the county level. 

## Approach and Analysis
*What is your approach to answering your project question?*
*How will you use the identified data to answer your project question?*
📝 <!-- Start Discussing the project here; you can add as many code cells as you need -->

My approach is to join the climate data set to the agricultural data set on the specified region, Southeast Indiana.  I need to fully understand the NOAA weather data's schema to determine if my county, Ripley, is included in the Wilmington Station, which it should be.

Once I have identified that Wilmington Weather Station does indeed cover my local area, I will attempt to extrapolate the annual rainfall and temperature data to determine if there is a correlation between weather and crop yield.

In [50]:
# Start your code here

import pandas as pd
import numpy as np

import os
from dotenv import load_dotenv

# Load the project environment variables
load_dotenv(override=True)

import requests
from urllib.request import urlretrieve, urlparse
from bs4 import BeautifulSoup

# Configure pandas to display 500 rows; otherwise it will truncate the output
pd.set_option('display.max_rows', 500)

"This product uses the NASS API but is not endorsed or certified by NASS."

In [44]:
# URL='https://quickstats.nass.usda.gov/results/5707E545-6B9E-35A4-AF77-DAF0BA7D7A7B'

API_KEY = os.getenv('API_KEY')

# url = 'https://quickstats.nass.usda.gov/api/api_GET/?key=API_KEY&commodity_desc=CORN&year__GE=2010&state_alpha=VA'
url = 'https://quickstats.nass.usda.gov/api/api_GET/'
params = {
    "key":API_KEY,
    "commodity_desc":"CORN",
    "year__GE":"2010",
    "state_alpha":"IN"
}
response = requests.get(url=url, params=params)
data = response.json()
data[0]
# nass_df = pd.DataFrame.from_records(response.json()[1:],columns=response.json()[0])
# nass_df.head()

[('data',
  [{'watershed_desc': '',
    'source_desc': 'CENSUS',
    'load_time': '2018-02-01 00:00:00.000',
    'county_name': '',
    'statisticcat_desc': 'SALES',
    'domain_desc': 'AREA OPERATED',
    'week_ending': '',
    'watershed_code': '00000000',
    'sector_desc': 'CROPS',
    'year': 2017,
    'state_alpha': 'IN',
    'asd_code': '',
    'end_code': '00',
    'state_fips_code': '18',
    'domaincat_desc': 'AREA OPERATED: (1,000 TO 1,999 ACRES)',
    'Value': '984,853,000',
    'class_desc': 'ALL CLASSES',
    'country_code': '9000',
    'state_ansi': '18',
    'util_practice_desc': 'ALL UTILIZATION PRACTICES',
    'unit_desc': '$',
    'zip_5': '',
    'asd_desc': '',
    'country_name': 'UNITED STATES',
    'state_name': 'INDIANA',
    'group_desc': 'FIELD CROPS',
    'location_desc': 'INDIANA',
    'reference_period_desc': 'YEAR',
    'agg_level_desc': 'STATE',
    'commodity_desc': 'CORN',
    'begin_code': '00',
    'county_code': '',
    'short_desc': 'CORN - SALES, 

In [49]:
nass_df = pd.DataFrame.from_records(data=data)
nass_df

Unnamed: 0,data
0,"{'watershed_desc': '', 'source_desc': 'CENSUS'..."
1,"{'freq_desc': 'ANNUAL', 'CV (%)': '2.1', 'coun..."
2,"{'sector_desc': 'CROPS', 'year': 2017, 'state_..."
3,{'prodn_practice_desc': 'ALL PRODUCTION PRACTI...
4,"{'asd_code': '', 'state_alpha': 'IN', 'year': ..."
...,...
24927,"{'state_alpha': 'IN', 'asd_code': '', 'sector_..."
24928,"{'county_code': '', 'congr_district_code': '',..."
24929,"{'group_desc': 'FIELD CROPS', 'country_name': ..."
24930,"{'watershed_code': '00000000', 'year': 2010, '..."


In [22]:
lcd_df = pd.read_csv('./data/ncei_lcd_data.csv')

  lcd_df = pd.read_csv('./data/ncei_lcd_data.csv')


In [23]:
lcd_df.describe()

Unnamed: 0,STATION,AWND,BackupDirection,BackupDistance,BackupDistanceUnit,BackupElements,BackupElevation,BackupElevationUnit,BackupEquipment,BackupLatitude,...,ShortDurationPrecipitationValue045,ShortDurationPrecipitationValue060,ShortDurationPrecipitationValue080,ShortDurationPrecipitationValue100,ShortDurationPrecipitationValue120,ShortDurationPrecipitationValue150,ShortDurationPrecipitationValue180,Sunrise,Sunset,WindEquipmentChangeDate
count,125503.0,119.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,109.0,109.0,109.0,109.0,109.0,109.0,109.0,3652.0,3652.0,0.0
mean,72429790000.0,5.801681,,,,,,,,,...,0.553853,0.613486,0.675872,0.729266,0.769358,0.82211,0.87156,632.117196,1854.235761,
std,0.0,1.363258,,,,,,,,,...,0.449718,0.494521,0.517793,0.528328,0.532013,0.538618,0.553258,89.232974,98.73737,
min,72429790000.0,3.4,,,,,,,,,...,0.06,0.07,0.08,0.09,0.12,0.14,0.14,511.0,1715.0,
25%,72429790000.0,4.7,,,,,,,,,...,0.28,0.31,0.35,0.38,0.42,0.48,0.53,539.0,1748.0,
50%,72429790000.0,5.8,,,,,,,,,...,0.46,0.49,0.55,0.59,0.61,0.68,0.73,629.0,1846.0,
75%,72429790000.0,6.9,,,,,,,,,...,0.64,0.7,0.79,0.87,0.92,0.99,1.08,725.0,1939.0,
max,72429790000.0,8.3,,,,,,,,,...,2.47,2.9,3.11,3.15,3.15,3.15,3.21,757.0,2008.0,


In [24]:
lcd_df.head()

Unnamed: 0,STATION,DATE,REPORT_TYPE,SOURCE,AWND,BackupDirection,BackupDistance,BackupDistanceUnit,BackupElements,BackupElevation,...,ShortDurationPrecipitationValue045,ShortDurationPrecipitationValue060,ShortDurationPrecipitationValue080,ShortDurationPrecipitationValue100,ShortDurationPrecipitationValue120,ShortDurationPrecipitationValue150,ShortDurationPrecipitationValue180,Sunrise,Sunset,WindEquipmentChangeDate
0,72429793812,2013-10-01T00:09:00,FM-16,7,,,,,,,...,,,,,,,,,,
1,72429793812,2013-10-01T00:11:00,FM-16,7,,,,,,,...,,,,,,,,,,
2,72429793812,2013-10-01T00:21:00,FM-16,7,,,,,,,...,,,,,,,,,,
3,72429793812,2013-10-01T00:32:00,FM-16,7,,,,,,,...,,,,,,,,,,
4,72429793812,2013-10-01T00:40:00,FM-16,7,,,,,,,...,,,,,,,,,,


In [25]:
lcd_df.sample(10)

Unnamed: 0,STATION,DATE,REPORT_TYPE,SOURCE,AWND,BackupDirection,BackupDistance,BackupDistanceUnit,BackupElements,BackupElevation,...,ShortDurationPrecipitationValue045,ShortDurationPrecipitationValue060,ShortDurationPrecipitationValue080,ShortDurationPrecipitationValue100,ShortDurationPrecipitationValue120,ShortDurationPrecipitationValue150,ShortDurationPrecipitationValue180,Sunrise,Sunset,WindEquipmentChangeDate
59691,72429793812,2018-07-05T04:10:00,FM-16,7,,,,,,,...,,,,,,,,,,
98251,72429793812,2021-08-04T00:53:00,FM-15,7,,,,,,,...,,,,,,,,,,
124836,72429793812,2023-09-14T03:53:00,FM-15,7,,,,,,,...,,,,,,,,,,
1175,72429793812,2013-10-28T22:53:00,FM-15,7,,,,,,,...,,,,,,,,,,
6738,72429793812,2014-04-15T07:48:00,FM-16,7,,,,,,,...,,,,,,,,,,
104829,72429793812,2022-02-09T23:53:00,FM-15,7,,,,,,,...,,,,,,,,,,
111653,72429793812,2022-08-31T04:51:00,FM-16,6,,,,,,,...,,,,,,,,,,
101868,72429793812,2021-11-07T23:53:00,FM-15,7,,,,,,,...,,,,,,,,,,
100348,72429793812,2021-10-01T06:53:00,FM-15,7,,,,,,,...,,,,,,,,,,
100264,72429793812,2021-09-29T07:15:00,FM-16,7,,,,,,,...,,,,,,,,,,


IndyStar, Ripley County, Indiana Aggregated Weather Data

In [51]:
indy_star_summary_url = 'https://data.indystar.com/weather-data/ripley-county/18137/2023-07-01/?syear=1895&eyear=2023#summary'
indy_star_table_url = 'https://data.indystar.com/weather-data/ripley-county/18137/2023-07-01/table/'

In [None]:
page = requests.get(indy_star_table_url)
soup = BeautifulSoup(page.content, 'html.parser')
# print(soup.prettify())

In [None]:
tables = soup.find_all('table')
weather_data = pd.read_html(str(tables[0]))[0]
weather_data.head()

## Resources and References
*What resources and references have you used for this project?*
📝 <!-- Answer Below -->
* Data source references listed above
* Bing Chat with GPT-4
* https://www.ncei.noaa.gov/data/local-climatological-data/doc/LCD_documentation.pdf
* https://www.ncdc.noaa.gov/cdo-web/datasets
*

In [2]:
# ⚠️ Make sure you run this cell at the end of your notebook before every submission!
!jupyter nbconvert --to python source.ipynb

[NbConvertApp] Converting notebook source.ipynb to python
[NbConvertApp] Writing 1271 bytes to source.py
