## Developed by Jonathan Ojeda 20-01-2020 QAAFI UQ

_**Functionalities:**_
* Convert data from date to Julian Day
* Get corn planting dates by US state using data from NASS-USDA
* Convert dtypes
* Locate specific rows and make a new dataframe
* Use of pivot table to transpose data
* Create the campaign file (30 arc-minute resolution, approx. 40 km) for planting date, used as input to `netCDF_Write-PlantingDateV2.ipynb`

_Note: the output `pdateCampaign.csv` generated by this code is used in step 5 at `netCDF_Write-PlantingDateV2.ipynb`._
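
The date-to-Julian-Day conversion listed above can be sketched as follows. This is a minimal illustration that treats "Julian Day" as day of year (DOY), which is how the `DOY` column is used later in this notebook. The date strings and the `dates` variable are hypothetical examples, not values taken from the NASS-USDA files.

```python
import pandas as pd

# Hypothetical planting dates (assumption); the real NASS-USDA values are not shown here.
dates = pd.Series(["2019-04-02", "2019-05-07", "2020-05-07"])

# Parse the strings and extract the day of year (1-366).
doy = pd.to_datetime(dates, format="%Y-%m-%d").dt.dayofyear

print(doy.tolist())  # [92, 127, 128]
```

Note that the same calendar date maps to a different DOY in a leap year (7 May is 127 in 2019 and 128 in 2020).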

In [2]:
#import required packages
import os
import sqlite3
import datetime
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import dateutil
import pylab as py
import seaborn as sns
import scipy 
from scipy import stats
import sklearn.metrics
from numpy  import array
import glob
import functools
from functools import reduce
import matplotlib.ticker as ticker
from mpl_toolkits.axes_grid1.inset_locator import (inset_axes, InsetPosition, mark_inset)
import statsmodels.api as sm
import matplotlib.patheffects as path_effects
import matplotlib.lines as mlines
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

### 1- Read sorghum planting date estimates from NASS-USDA and load them into a dataframe

This dataframe (`df2`) shows the median sorghum planting date (DOY) by lat/lon combination in the USA.

In [23]:
df=pd.read_csv(r'C:\Users\uqjojeda\Nextcloud\PURTERRA-A0131\2020\inputs\CornPdatesLatLonFinal.txt')
# Keep only rows with a valid planting date; .copy() avoids SettingWithCopyWarning
df2 = df.loc[df['DOY'] > 0].copy()
# Rename the grid coordinate columns
df2 = df2.rename(columns={'Grids40K_2': 'lat', 'Grids40K_3': 'lon'})
# Drop the GIS helper columns that are not needed downstream
df2 = df2.drop(columns=['FID', 'FID_latlon', 'FID_psims4', 'psims40kmp', 'pdates3_cs',
       'pdates3__1', 'pdates3__2', 'pdates3__3', 'pdates3__4', 'pdates3__5',
       'pdates3__6', 'pdates3__7', 'pdates3__8', 'pdates3__9', 'pdates3_10',
       'pdates3_11', 'pdates3_12', 'EmptyAreas', 'EmptyAre_1', 'EmptyAre_2',
       'FID_psimsT', 'psimsTiles', 'Grids40KmL', 'Grids40K_1',
       'Grids40K_4', 'FID_EmptyA', 'EmptyAre_3', 'EmptyAre_4', 'EmptyAre_5',
       'EmptyAre_6'])
df2.head()


|      | lat   | lon    | DOY  |
|------|-------|--------|------|
| 4019 | 25.25 | -82.25 | 92.0 |
| 4020 | 25.25 | -81.75 | 92.0 |
| 4021 | 25.25 | -81.25 | 92.0 |
| 4022 | 25.25 | -80.75 | 92.0 |
| 4023 | 25.25 | -80.25 | 92.0 |


### 2- Read pSIMSV2 lat/lon combinations

This dataframe (`camp`) shows the lat/lon combinations for pSIMSV2 at 30 arc-minute resolution.

In [24]:
#latlon = pd.read_csv(r'C:/Users/uqjojeda/Nextcloud/PURTERRA-A0131/2020/inputs/LatLonByState.csv')
#tile = pd.read_csv(r'C:/Users/uqjojeda/Nextcloud/PURTERRA-A0131/2020/inputs/Grids40KmList.csv')
camp = pd.read_csv(r'C:/Users/uqjojeda/Nextcloud/PURTERRA-A0131/2020/inputs/CampaignLatLonList.csv')
camp.head()

|   | lat    | lon     |
|---|--------|---------|
| 0 | -89.75 | -179.75 |
| 1 | -89.25 | -179.75 |
| 2 | -88.75 | -179.75 |
| 3 | -88.25 | -179.75 |
| 4 | -87.75 | -179.75 |


### 3- Merge NASS lat/lon combinations with pSIMSV2 lat/lon combinations

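This dataframe (`final`) joins the NASS planting dates onto the pSIMSV2 lat/lon grid, giving the lat/lon combinations needed as input for pSIMSV2. Grid cells without a NASS estimate have a missing (`NaN`) `DOY`.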

In [22]:
final = pd.merge(camp, df2, how="outer", on=["lat","lon"])
final.head()

|   | lat    | lon     | DOY |
|---|--------|---------|-----|
| 0 | -89.75 | -179.75 | NaN |
| 1 | -89.25 | -179.75 | NaN |
| 2 | -88.75 | -179.75 | NaN |
| 3 | -88.25 | -179.75 | NaN |
| 4 | -87.75 | -179.75 | NaN |
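
One way to check how many grid cells actually received a planting date is the `indicator=True` option of `pd.merge`, which adds a `_merge` column. The sketch below uses small toy dataframes (`camp_demo` and `df2_demo` are hypothetical stand-ins for `camp` and `df2`), so the counts are illustrative only.

```python
import pandas as pd

# Toy stand-ins (assumption) for `camp` (full grid) and `df2` (cells with a NASS estimate).
camp_demo = pd.DataFrame({"lat": [25.25, 25.25, 25.75],
                          "lon": [-82.25, -81.75, -82.25]})
df2_demo = pd.DataFrame({"lat": [25.25, 25.25],
                         "lon": [-82.25, -81.75],
                         "DOY": [92.0, 92.0]})

# indicator=True records where each row came from: 'both', 'left_only' or 'right_only'.
merged = pd.merge(camp_demo, df2_demo, how="outer", on=["lat", "lon"], indicator=True)

n_both = int((merged["_merge"] == "both").sum())         # grid cells with a planting date
n_left = int((merged["_merge"] == "left_only").sum())    # grid cells without one (DOY is NaN)
n_right = int((merged["_merge"] == "right_only").sum())  # NASS cells missing from the grid

print(n_both, n_left, n_right)  # 2 1 0
```

With `how="outer"`, any `right_only` rows would be extra cells that are not part of the pSIMSV2 grid, so a non-zero count there is worth investigating before building the campaign file.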


### 4- Fill NaN values and remove decimals

Here the NaN values are filled with the mean sorghum planting date used for this example (DOY = 127), and the values are rounded to the nearest whole day.

In [18]:
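# Fill grid cells without a NASS estimate with the example default (DOY = 127)
final['DOY'] = final['DOY'].fillna(127)
# Round to the nearest whole day (values remain floats, e.g. 127.0)
final['DOY'] = final['DOY'].round(0)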

In [21]:
final.head()

|   | lat    | lon     | DOY   |
|---|--------|---------|-------|
| 0 | -89.75 | -179.75 | 127.0 |
| 1 | -89.25 | -179.75 | 127.0 |
| 2 | -88.75 | -179.75 | 127.0 |
| 3 | -88.25 | -179.75 | 127.0 |
| 4 | -87.75 | -179.75 | 127.0 |
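
The fill value above is hard-coded. As a possible alternative, the sketch below derives it from the available data and converts the column to integers so the trailing `.0` disappears. The toy dataframe `final_demo` and the choice of the column mean as the fill value are assumptions made for illustration; whether the downstream notebook expects integers is also not confirmed here.

```python
import numpy as np
import pandas as pd

# Toy stand-in (assumption) for `final`, with one missing planting date.
final_demo = pd.DataFrame({"lat": [25.25, 25.75, 26.25],
                           "lon": [-82.25, -82.25, -82.25],
                           "DOY": [92.0, np.nan, 161.6]})

# Mean of the non-missing values, rounded to a whole day.
fill_value = int(round(final_demo["DOY"].mean()))

# Fill, round, and cast to int so values read 127 rather than 127.0.
final_demo["DOY"] = final_demo["DOY"].fillna(fill_value).round().astype(int)

print(fill_value)                   # 127
print(final_demo["DOY"].tolist())   # [92, 127, 162]
```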


### 5- Use pivot table to create the array for the campaign file in pSIMSV2 and export it

In [38]:
final2 = final.pivot_table(values='DOY',index='lat',columns='lon', dropna=False)
#final2.to_csv(r'C:\Users\uqjojeda\Nextcloud\PURTERRA-A0131\2020\inputs\pdateCampaign.csv')
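
To make the reshaping concrete, the sketch below applies the same `pivot_table` call to a four-cell toy grid (`final_demo` is a hypothetical stand-in for `final`). Rows of the result are latitudes and columns are longitudes, both sorted in ascending order.

```python
import pandas as pd

# Toy stand-in (assumption) for `final`: a 2 x 2 lat/lon grid in long format.
final_demo = pd.DataFrame({
    "lat": [25.25, 25.25, 25.75, 25.75],
    "lon": [-82.25, -81.75, -82.25, -81.75],
    "DOY": [92.0, 92.0, 127.0, 127.0],
})

# Long format -> 2-D array: one row per latitude, one column per longitude.
grid = final_demo.pivot_table(values="DOY", index="lat", columns="lon", dropna=False)

print(grid.shape)                 # (2, 2)
print(grid.loc[25.75, -81.75])    # 127.0
```

`pivot_table` aggregates with the mean by default, so if a lat/lon pair appeared more than once its `DOY` values would be averaged rather than raising an error.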

### 6- Remember to remove the lat and lon cells before creating the campaign file for pSIMSV2!
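
One way to avoid removing these cells by hand is to leave them out at export time. The sketch below assumes that "lat and lon cells" means the header row (longitudes) and the index column (latitudes) of the pivoted table. It writes to an in-memory buffer instead of the real `pdateCampaign.csv` path, and `grid` is a hypothetical stand-in for `final2`.

```python
import io
import pandas as pd

# Toy stand-in (assumption) for `final2`: latitudes as index, longitudes as columns.
grid = pd.DataFrame(
    [[92.0, 92.0], [127.0, 127.0]],
    index=pd.Index([25.25, 25.75], name="lat"),
    columns=pd.Index([-82.25, -81.75], name="lon"),
)

# header=False drops the longitude row; index=False drops the latitude column.
buffer = io.StringIO()
grid.to_csv(buffer, header=False, index=False)
csv_text = buffer.getvalue()

print(csv_text.splitlines())  # ['92.0,92.0', '127.0,127.0']
```

Passing a file path instead of `buffer` would write the same values-only layout to disk.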