# Mod 4 Project - Starter Notebook

This notebook has been provided to you so that you can make use of the following starter code to help with the trickier parts of preprocessing the Zillow dataset. 

The notebook contains a rough outline of the general order you'll likely want to follow in this project. You'll notice that most of the areas are left blank. This is so that it's more obvious exactly when you should make use of the starter code provided for preprocessing. 

**_NOTE:_** The number of empty cells is not meant to imply how much or how little code should be involved in any given step--we've just provided a few for your convenience. Add, delete, and change things around in this notebook as needed!

# Some Notes Before Starting

This project will be one of the more challenging projects you complete in this program. This is because working with Time Series data is a bit different than working with regular datasets. In order to make this a bit less frustrating and help you understand what you need to do (and when you need to do it), we'll quickly review the dataset formats that you'll encounter in this project. 

## Wide Format vs Long Format

If you take a look at the format of the data in `zillow_data.csv`, you'll notice that the actual Time Series values are stored as separate columns. Here's a sample: 

<img src='~/../images/df_head.png'>

You'll notice that the first seven columns look like any other dataset you're used to working with. However, column 8 holds the median housing sales values for April 1996, column 9 for May 1996, and so on. This is called **_Wide Format_**, and it makes the dataframe intuitive and easy to read. However, this format causes problems when it comes to actually learning from the data, because each value only makes sense if you know the name of the column it is found in. Since column names are metadata, our algorithms will miss out on which date each value belongs to. This means that before we pass this data to our ARIMA model, we'll need to reshape our dataset to **_Long Format_**. Reshaped into long format, the dataframe above would now look like:

<img src='~/../images/melted1.png'>

There are now many more rows in this dataset--one for each unique time and zipcode combination in the data! Once our dataset is in this format, we'll be able to train an ARIMA model on it. The method used to convert from Wide to Long is `pd.melt()`, and it is common to refer to our dataset as 'melted' after the transition to denote that it is in long format. 
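To make the reshape concrete, here is a minimal sketch of `pd.melt()` on a toy wide-format frame. The two rows and two month columns are copied from the sample shown earlier; the variable names `wide` and `long` are only illustrative.

```python
import pandas as pd

# Toy wide-format frame: one row per zipcode, one column per month.
wide = pd.DataFrame({
    'RegionName': [60657, 75070],
    'City': ['Chicago', 'McKinney'],
    '1996-04': [334200, 235700],
    '1996-05': [335400, 236900],
})

# Identifier columns stay put; every other column becomes a (Month, MeanValue) row.
long = pd.melt(wide, id_vars=['RegionName', 'City'],
               var_name='Month', value_name='MeanValue')

# The former column names are strings, so convert them to real timestamps.
long['Month'] = pd.to_datetime(long['Month'], format='%Y-%m')

print(long)
```

Two zipcodes times two months gives four rows, and the date now lives in the data itself rather than in the column names.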

# Helper Functions Provided

Melting a dataset can be tricky if you've never done it before, so we have provided a sample function, `melt_data()`, below to help you with this step. Also provided are:

* `get_datetimes()`, a function that converts the column values holding dates into a pandas series of datetime objects
* Some good parameters for matplotlib to help make your visualizations more readable. 
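
The body of `get_datetimes()` is not shown in this notebook, so the following is only a guess at what such a helper might look like. It assumes the dates are the column names of the wide-format frame, that they follow the `YYYY-MM` pattern, and that the first seven columns are identifiers; the `n_id_cols` parameter is hypothetical.

```python
import pandas as pd

def get_datetimes(df, n_id_cols=7):
    """Return the date-named columns of a wide frame as datetime objects.

    Assumption: the first `n_id_cols` columns are identifiers and every
    remaining column name looks like 'YYYY-MM'.
    """
    return pd.to_datetime(df.columns.values[n_id_cols:], format='%Y-%m')

# Empty frame with the same column layout as the wide Zillow data (two months only).
wide = pd.DataFrame(columns=['RegionID', 'RegionName', 'City', 'State', 'Metro',
                             'CountyName', 'SizeRank', '1996-04', '1996-05'])
dates = get_datetimes(wide)
print(dates)
```

Note that `pd.to_datetime` returns a `DatetimeIndex` for array input; wrap the result in `pd.Series(...)` if a series is specifically needed.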

Good luck!


# Step 1: Load the Data/Filtering for Chosen Zipcodes

In [1]:
!pip install -U fsds_100719

Collecting fsds_100719
  Downloading https://files.pythonhosted.org/packages/9d/41/21fa1d020e659e4e4520b372fc530a37b18a23ab54de9d2d9e8d252f3acb/fsds_100719-0.6.4.tar.gz (65kB)
Building wheels for collected packages: fsds-100719
  Building wheel for fsds-100719 (setup.py): started
  Building wheel for fsds-100719 (setup.py): finished with status 'done'
  Stored in directory: C:\Users\zachmih\AppData\Local\pip\Cache\wheels\9c\08\dc\81d37fe3bc0cfdc61b764d03f07529670f6722a35b87a9801d
Successfully built fsds-100719
Installing collected packages: fsds-100719
  Found existing installation: fsds-100719 0.5.11
    Uninstalling fsds-100719-0.5.11:
      Successfully uninstalled fsds-100719-0.5.11
Successfully installed fsds-100719-0.6.4




In [3]:
from fsds_100719.imports import *

pd.set_option('display.max_columns',0)

import warnings
warnings.filterwarnings('ignore')

plt.style.use('seaborn-notebook')


In [5]:
#df = pd.read_csv('https://raw.githubusercontent.com/learn-co-students/dsc-mod-4-project-online-ds-ft-10072019/master/zillow_data.csv')
df = pd.read_csv('zillow_data.csv')
df.head().style.set_caption("Original Wide Format")

Unnamed: 0,RegionID,RegionName,City,State,Metro,CountyName,SizeRank,1996-04,1996-05,1996-06,1996-07,1996-08,1996-09,1996-10,1996-11,1996-12,1997-01,1997-02,1997-03,1997-04,1997-05,1997-06,1997-07,1997-08,1997-09,1997-10,1997-11,1997-12,1998-01,1998-02,1998-03,1998-04,1998-05,1998-06,1998-07,1998-08,1998-09,1998-10,1998-11,1998-12,1999-01,1999-02,1999-03,1999-04,1999-05,1999-06,1999-07,1999-08,1999-09,1999-10,1999-11,1999-12,2000-01,2000-02,2000-03,2000-04,2000-05,2000-06,2000-07,2000-08,2000-09,2000-10,2000-11,2000-12,2001-01,2001-02,2001-03,2001-04,2001-05,2001-06,2001-07,2001-08,2001-09,2001-10,2001-11,2001-12,2002-01,2002-02,2002-03,2002-04,2002-05,2002-06,2002-07,2002-08,2002-09,2002-10,2002-11,2002-12,2003-01,2003-02,2003-03,2003-04,2003-05,2003-06,2003-07,2003-08,2003-09,2003-10,2003-11,2003-12,2004-01,2004-02,2004-03,2004-04,2004-05,2004-06,2004-07,2004-08,2004-09,2004-10,2004-11,2004-12,2005-01,2005-02,2005-03,2005-04,2005-05,2005-06,2005-07,2005-08,2005-09,2005-10,2005-11,2005-12,2006-01,2006-02,2006-03,2006-04,2006-05,2006-06,2006-07,2006-08,2006-09,2006-10,2006-11,2006-12,2007-01,2007-02,2007-03,2007-04,2007-05,2007-06,2007-07,2007-08,2007-09,2007-10,2007-11,2007-12,2008-01,2008-02,2008-03,2008-04,2008-05,2008-06,2008-07,2008-08,2008-09,2008-10,2008-11,2008-12,2009-01,2009-02,2009-03,2009-04,2009-05,2009-06,2009-07,2009-08,2009-09,2009-10,2009-11,2009-12,2010-01,2010-02,2010-03,2010-04,2010-05,2010-06,2010-07,2010-08,2010-09,2010-10,2010-11,2010-12,2011-01,2011-02,2011-03,2011-04,2011-05,2011-06,2011-07,2011-08,2011-09,2011-10,2011-11,2011-12,2012-01,2012-02,2012-03,2012-04,2012-05,2012-06,2012-07,2012-08,2012-09,2012-10,2012-11,2012-12,2013-01,2013-02,2013-03,2013-04,2013-05,2013-06,2013-07,2013-08,2013-09,2013-10,2013-11,2013-12,2014-01,2014-02,2014-03,2014-04,2014-05,2014-06,2014-07,2014-08,2014-09,2014-10,2014-11,2014-12,2015-01,2015-02,2015-03,2015-04,2015-05,2015-06,2015-07,2015-08,2015-09,2015-10,2015-11,2015-12,2016-01,2016-02,2016-03,2016-04,2016
-05,2016-06,2016-07,2016-08,2016-09,2016-10,2016-11,2016-12,2017-01,2017-02,2017-03,2017-04,2017-05,2017-06,2017-07,2017-08,2017-09,2017-10,2017-11,2017-12,2018-01,2018-02,2018-03,2018-04
0,84654,60657,Chicago,IL,Chicago,Cook,1,334200,335400,336500,337600,338500,339500,340400,341300,342600,344400,345700,346700,347800,349000,350400,352000,353900,356200,358800,361800,365700,370200,374700,378900,383500,388300,393300,398500,403800,409100,414600,420100,426200,432600,438600,444200,450000,455900,462100,468500,475300,482500,490200,498200,507200,516800,526300,535300,544500,553500,562400,571200,579800,588100,596300,604200,612200,620200,627700,634500,641000,647000,652700,658100,663300,668400,673400,678300,683200,688300,693300,698000,702400,706400,710200,714000,717800,721700,725700,729900,733400,735600,737200,739000,740900,742700,744400,746000,747200.0,748000.0,749000.0,750200.0,752300.0,755300.0,759200.0,764000.0,769600.0,775600.0,781900.0,787900.0,793200.0,798200.0,803100.0,807900.0,812900.0,818100.0,823100.0,828300.0,834100.0,839800.0,845600.0,851800.0,858000.0,864400.0,870700.0,876200.0,880700.0,884400.0,887600.0,890500.0,893300.0,895500.0,897300.0,899000.0,900400.0,902000.0,904400.0,907100.0,909700.0,911900.0,913000.0,913000.0,912000.0,909300.0,905300.0,901400.0,897900.0,895400.0,893600.0,891100.0,887000.0,881700.0,875900.0,870300.0,865100.0,859000.0,851500.0,843800.0,836400.0,830700.0,827300.0,824800.0,821600.0,818300.0,814600.0,809800.0,803600.0,795500.0,786900.0,780700,776900,774700,774200,774400,774600,775600,777800,775200,767900,764700,766100,764100,759700,754900,746200,737300,730800,729300,730200,730700,730000,730100,730100,731200,733900,735500,735400,734400,737500,737700,733700,734000,740300,744600,750500,760400,771800,780600,787900.0,794100.0,798900.0,802300.0,806100.0,810900.0,817400.0,826800.0,837900.0,848100.0,853800.0,856700.0,856600.0,854400.0,853000.0,856200.0,859700.0,863900.0,872900.0,883300.0,889500.0,892800,893600,891300,889900,891500,893000,893000,895000,901200,909400,915000,916700,917700,919800,925800,937100,948200,951000,952500,958600,966200,970400,973900,974700,972600,974300,980800,988000,994700,998700,997000,993700,991300,989200,99130
0,999100,1005500,1007500,1007800,1009600,1013300,1018700,1024400,1030700,1033800,1030600
1,90668,75070,McKinney,TX,Dallas-Fort Worth,Collin,2,235700,236900,236700,235400,233300,230600,227300,223400,219600,215800,211100,205700,200900,196800,193600,191400,190400,190800,192700,196000,201300,207400,212200,214600,215100,213400,210200,206100,202100,198800,196100,194100,193400,193400,193100,192700,193000,193700,194800,196100,197800,199700,201900,204500,207800,211500,214900,217800,221100,224100,226700,228200,228500,227200,224900,221900,219100,216900,215400,214500,214600,215600,217000,218400,219600,220000,219100,216800,213100,208700,204000,199600,195700,192800,190800,189600,189200,189200,189600,190300,190800,191000,190700,190300,189800,189200,188600,188000,187500.0,187200.0,187000.0,186900.0,187100.0,187700.0,188800.0,190300.0,191800.0,193000.0,193900.0,194500.0,195100.0,195700.0,196400.0,197400.0,198500.0,199600.0,200300.0,200800.0,201000.0,201000.0,201000.0,200900.0,200900.0,200900.0,201200.0,201600.0,202200.0,202700.0,203300.0,203900.0,204500.0,205100.0,205800.0,206500.0,207200.0,207800.0,208400.0,208900.0,209400.0,209700.0,210000.0,210400.0,211000.0,211600.0,212400.0,213000.0,213400.0,213600.0,213800.0,213900.0,214100.0,213900.0,213500.0,212600.0,211200.0,209500.0,207900.0,206700.0,205900.0,205300.0,204600.0,203800.0,203200.0,202400.0,201700.0,201200.0,200700.0,200000.0,199700.0,199700,199900,200100,200200,200200,200100,201300,202000,202100,202700,203700,203300,203100,202900,202400,202400,202500,202500,202400,202500,202100,201300,200700,200500,200000,199300,199100,199200,199400,199500,199600,200100,200700,201800,202700,203000,203000,203000,203100,203500.0,204600.0,205600.0,205900.0,206900.0,208500.0,209800.0,211300.0,214000.0,217200.0,220600.0,223800.0,226500.0,228600.0,230400.0,231800.0,233000.0,234200.0,235400.0,236600.0,238500.0,240500,242600,244700,246300,247600,249600,251400,253000,255200,258000,261200,264700,268400,271400,273600,275200,276400,277000,277900,280000,282600,285400,288400,290800,292000,292800,293700,295200,297000,299000,300800,301800,302800
,304400,306200,307000,308000,310000,312500,314100,315000,316600,318100,319600,321100,321800
2,91982,77494,Katy,TX,Houston,Harris,3,210400,212200,212200,210700,208300,205500,202500,199800,198300,197300,195400,193000,191800,191800,193000,195200,198400,202800,208000,213800,220700,227500,231800,233400,233900,233500,233300,234300,237400,242800,250200,258600,268000,277000,283600,288500,293900,299200,304300,308600,311400,312300,311900,311100,311700,313500,315000,316700,319800,323700,327500,329900,329800,326400,320100,312200,304700,298700,294300,291400,290800,291600,293000,293600,292900,290500,286700,282200,276900,271000,264200,257000,249700,243100,237000,231700,227100,223300,220300,217300,214700,213800,215100,217300,219600,221400,222300,222700,223000.0,223700.0,225100.0,227200.0,229600.0,231800.0,233100.0,233500.0,233000.0,232100.0,231300.0,230700.0,230800.0,231500.0,232700.0,234000.0,235400.0,237000.0,238800.0,240700.0,241800.0,241700.0,240700.0,239300.0,238000.0,236800.0,235700.0,234700.0,233400.0,231700.0,230200.0,229100.0,228400.0,228700.0,229400.0,230400.0,231600.0,233000.0,234700.0,237100.0,240100.0,243000.0,244800.0,245400.0,245100.0,244900.0,245600.0,246800.0,248600.0,250600.0,252500.0,254000.0,254600.0,254100.0,252700.0,251100.0,249500.0,248300.0,247800.0,247600.0,247800.0,247900.0,247800.0,247600.0,247300.0,246700.0,246100.0,245800.0,245900.0,246200.0,246800.0,247200,247600,247900,248000,248000,249000,249200,247800,248100,250800,251700,251200,251100,250500,250000,249900,249700,247900,247400,248800,249700,249100,249200,249500,249400,249400,248900,248000,247100,246800,248600,251600,252800,252400,252600,252700,252300,252500,253400,254200.0,255200.0,256400.0,256900.0,256800.0,256700.0,257100.0,258300.0,260700.0,263900.0,267000.0,269200.0,271000.0,273100.0,275600.0,277600.0,279800.0,282100.0,284200.0,286000.0,288300.0,290700,293300,295900,298300,300200,301300,301700,302400,303600,306200,309100,311900,314100,316300,319000,322000,324300,326100,327300,327000,327200,328500,329800,330000,329000,327800,326700,325500,324700,324500,323700,322300,320700,320000,320000
,320900,321000,320600,320200,320400,320800,321200,321200,323000,326900,329900
3,84616,60614,Chicago,IL,Chicago,Cook,4,498100,500900,503100,504600,505500,505700,505300,504200,503600,503400,502200,500000,497900,496300,495200,494700,494900,496200,498600,502000,507600,514900,522200,529500,537900,546900,556400,566100,575600,584800,593500,601600,610100,618600,625600,631100,636600,642100,647600,653300,659300,665800,672900,680500,689600,699700,709300,718300,727600,737100,746600,756200,765800,775100,784400,793500,803000,812500,821200,829200,837000,844400,851600,858600,865300,871800,878200,884700,891300,898000,904700,911200,917600,923800,929800,935700,941400,947100,952800,958900,965100,971000,976400,981400,985700,989400,992900,996800,1000800.0,1004600.0,1008000.0,1010600.0,1012600.0,1014500.0,1017000.0,1020500.0,1024900.0,1029800.0,1035100.0,1040500.0,1046000.0,1052100.0,1058600.0,1065000.0,1071900.0,1079000.0,1086000.0,1093100.0,1100500.0,1107400.0,1113500.0,1118800.0,1123700.0,1129200.0,1135400.0,1141900.0,1148000.0,1152800.0,1155900.0,1157900.0,1159500.0,1161000.0,1162800.0,1165300.0,1168100.0,1171300.0,1174400.0,1176700.0,1178400.0,1179900.0,1181100.0,1182800.0,1184800.0,1185300.0,1183700.0,1181000.0,1177900.0,1175400.0,1173800.0,1171700.0,1167900.0,1163000.0,1157000.0,1150800.0,1144100.0,1135600.0,1125400.0,1113900.0,1102000.0,1091900.0,1085100.0,1079200.0,1072400.0,1065400.0,1057800.0,1048900.0,1037900.0,1024300.0,1010200.0,999000,990900,985400,983200,982400,982400,984100,987100,985000,977400,973300,973700,971700,965300,955400,943600,933700,925200,923000,925000,923300,916600,912400,910400,911900,918300,923500,923600,922900,928300,928900,923900,925300,938100,951900,965400,975900,984500,994100,1001400.0,1003100.0,1002700.0,1006300.0,1013700.0,1024800.0,1038300.0,1053900.0,1070600.0,1089900.0,1108100.0,1123700.0,1135100.0,1141000.0,1143900.0,1145800.0,1147500.0,1149900.0,1155200.0,1160100.0,1163300.0,1167700,1173900,1175100,1173500,1175500,1178500,1176400,1174600,1178500,1185700,1192900,1198800,1200400,1198900,1200200,1207400,1218600,1226600,1230700
,1235400,1241300,1245700,1247000,1246700,1245700,1246000,1247700,1252900,1260900,1267900,1272600,1276600,1280300,1282500,1286000,1289000,1289800,1287700,1287400,1291500,1296600,1299000,1302700,1306400,1308500,1307000
4,93144,79936,El Paso,TX,El Paso,El Paso,5,77300,77300,77300,77300,77400,77500,77600,77700,77700,77800,77900,77900,77800,77800,77800,77800,77800,77900,78100,78200,78400,78600,78800,79000,79100,79200,79300,79300,79300,79400,79500,79500,79600,79700,79900,80100,80300,80600,80900,81200,81400,81700,82100,82400,82600,82800,82900,83000,83000,82900,82800,82700,82400,82100,81900,81600,81300,81000,80800,80600,80300,80000,79800,79500,79200,78900,78600,78400,78200,78200,78200,78300,78400,78600,78900,79200,79500,79900,80300,80700,81000,81200,81400,81500,81500,81600,81700,81900,82000.0,82200.0,82500.0,82900.0,83400.0,84000.0,84700.0,85500.0,86400.0,87200.0,88000.0,88900.0,89700.0,90400.0,91100.0,91900.0,92700.0,93600.0,94400.0,95200.0,95800.0,96300.0,96700.0,97200.0,97700.0,98400.0,99000.0,99600.0,100200.0,101000.0,102000.0,103000.0,104300.0,105800.0,107400.0,109100.0,111000.0,113000.0,115000.0,117000.0,118800.0,120600.0,122200.0,124000.0,126000.0,128000.0,129600.0,130700.0,131400.0,132000.0,132300.0,132300.0,132000.0,131200.0,130300.0,129300.0,128300.0,127300.0,126300.0,125400.0,124600.0,123900.0,123300.0,122600.0,122100.0,121600.0,121200.0,120700.0,120300.0,119700.0,119100.0,118700,118400,118200,117900,117600,117400,117400,117500,117100,116100,115700,116100,116500,116700,117400,118200,118700,118800,119000,118800,118300,118100,117600,116800,116500,116100,114800,113500,112800,112700,112400,112200,112400,112800,113200,113400,113100,112800,112900,112900.0,112800.0,112700.0,113000.0,113300.0,113600.0,113500.0,113300.0,113000.0,113000.0,112900.0,112800.0,112500.0,112400.0,112000.0,111500.0,111400.0,112000.0,112500.0,112700.0,113100.0,113900,114400,114500,114400,114300,114400,114700,115000,115000,115200,115600,115900,115600,115400,115400,115500,115800,116300,116200,115600,115000,114500,114200,114000,114000,113900,114100,114900,115700,116300,116900,117300,117600,118000,118600,118900,119100,119400,120000,120300,120300,120300,120300,120500,121000,121500


In [7]:
import functions_mod4proj as ji
help(ji)

Help on module functions_mod4proj:

NAME
    functions_mod4proj

FUNCTIONS
    get_model_metrics(true, preds, train, explain_U=False)
    
    get_train_test_split_index(ts, TEST_SIZE=0.2)
    
    make_dateindex(df_to_add_index, index_col='Month', index_name='date', drop=True, freq=None, verbose=True)
    
    melt_data(df)
    
    meta_grid_search(ts, TEST_SIZE=0.2, model_kws={}, verbose=True, return_kws=False)
    
    plotly_timeseries(df, x='datetime', y='MeanValue', color='RegionID', line_group='State')
    
    stationarity_check(TS, plot=True, col=None)
        From: https://learn.co/tracks/data-science-career-v2/module-4-a-complete-data-science-project-using-multiple-regression/working-with-time-series-data/time-series-decomposition
    
    thiels_U(ys_true=None, ys_pred=None, display_equation=True, display_table=True)
        Calculate's Thiel's U metric for forecasting accuracy.
        Accepts true values and predicted values.
        Returns Thiel's U

FILE
    c:\users\

In [16]:
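def melt_data(df):
    # Identifier columns stay as-is; each date column becomes a (Month, MeanValue) row
    melted = pd.melt(df, id_vars = ['RegionID', 'RegionName', 'City', 'State', 'Metro', 'CountyName',
                                    'SizeRank'],var_name = 'Month', value_name = 'MeanValue')
    # Column names such as '1996-04' become month-start timestamps
    melted['Month'] = pd.to_datetime(melted['Month'], format='%Y-%m')
    # Drop zipcode/month pairs that have no recorded value
    melted = melted.dropna(subset = ['MeanValue'])
    return melted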



In [17]:
df = melt_data(df)
df.head().style.set_caption("MELTED LONG FORMAT")

Unnamed: 0,RegionID,RegionName,City,State,Metro,CountyName,SizeRank,Month,MeanValue
0,84654,60657,Chicago,IL,Chicago,Cook,1,1996-04-01 00:00:00,334200
1,90668,75070,McKinney,TX,Dallas-Fort Worth,Collin,2,1996-04-01 00:00:00,235700
2,91982,77494,Katy,TX,Houston,Harris,3,1996-04-01 00:00:00,210400
3,84616,60614,Chicago,IL,Chicago,Cook,4,1996-04-01 00:00:00,498100
4,93144,79936,El Paso,TX,El Paso,El Paso,5,1996-04-01 00:00:00,77300
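
The step title mentions filtering for chosen zipcodes. One way to do that on the long-format frame is a boolean mask with `isin()`, sketched below. The small `melted` frame stands in for the notebook's `df` after `melt_data()`, and `chosen_zipcodes` is a hypothetical selection.

```python
import pandas as pd

# Stand-in for the melted frame; values are taken from the sample rows above.
melted = pd.DataFrame({
    'RegionName': [60657, 75070, 77494, 60657, 75070, 77494],
    'Month': pd.to_datetime(['1996-04-01'] * 3 + ['1996-05-01'] * 3),
    'MeanValue': [334200, 235700, 210400, 335400, 236900, 212200],
})

chosen_zipcodes = [60657, 77494]  # hypothetical selection

# Keep only the rows whose zipcode is in the chosen list.
subset = melted[melted['RegionName'].isin(chosen_zipcodes)]
print(subset)
```

Filtering before any heavy processing keeps the frame small, which matters once the full dataset has been melted into millions of rows.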


# Step 2: Data Preprocessing

In [19]:
def make_dateindex(df_to_add_index, index_col='Month',
                  index_name = 'date', drop = True, verbose = True):
    '''Converts the index_col to a datetime index with name = index_name'''
    
    # Copy input df and reset index (reset_index returns a new frame, so reassign it)
    df = df_to_add_index.copy()
    df = df.reset_index(drop=True)
    
    # Make datetime column to use as the index; unparseable values become NaT
    df[index_name] = pd.to_datetime(df[index_col], errors = 'coerce')
    
    
    # Assign the new column as the index
    df = df.set_index(index_name, drop = drop)
    
    if verbose:
        display(df.index)
    return df

df = make_dateindex(df, index_col = 'Month', index_name = 'date')
df

DatetimeIndex(['1996-04-01', '1996-04-01', '1996-04-01', '1996-04-01',
               '1996-04-01', '1996-04-01', '1996-04-01', '1996-04-01',
               '1996-04-01', '1996-04-01',
               ...
               '2018-04-01', '2018-04-01', '2018-04-01', '2018-04-01',
               '2018-04-01', '2018-04-01', '2018-04-01', '2018-04-01',
               '2018-04-01', '2018-04-01'],
              dtype='datetime64[ns]', name='date', length=3744704, freq=None)

Unnamed: 0_level_0,RegionID,RegionName,City,State,Metro,CountyName,SizeRank,Month,MeanValue
date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
1996-04-01,84654,60657,Chicago,IL,Chicago,Cook,1,1996-04-01,334200.0
1996-04-01,90668,75070,McKinney,TX,Dallas-Fort Worth,Collin,2,1996-04-01,235700.0
1996-04-01,91982,77494,Katy,TX,Houston,Harris,3,1996-04-01,210400.0
1996-04-01,84616,60614,Chicago,IL,Chicago,Cook,4,1996-04-01,498100.0
1996-04-01,93144,79936,El Paso,TX,El Paso,El Paso,5,1996-04-01,77300.0
1996-04-01,91733,77084,Houston,TX,Houston,Harris,6,1996-04-01,95000.0
1996-04-01,61807,10467,New York,NY,New York,Bronx,7,1996-04-01,152900.0
1996-04-01,84640,60640,Chicago,IL,Chicago,Cook,8,1996-04-01,216500.0
1996-04-01,91940,77449,Katy,TX,Houston,Harris,9,1996-04-01,95400.0
1996-04-01,97564,94109,San Francisco,CA,San Francisco,San Francisco,10,1996-04-01,766000.0
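
The index printed above has `freq=None` and repeats each date, because every zipcode shares the same index. A single series with a regular frequency can be obtained by selecting one zipcode first. The sketch below assumes a date-indexed long frame like the one returned by `make_dateindex()`; `long_df` and `ts` are illustrative names, and the values are copied from the sample data.

```python
import pandas as pd

# Stand-in for the date-indexed long frame (two zipcodes, three months).
idx = pd.to_datetime(['1996-04-01', '1996-04-01', '1996-05-01', '1996-05-01',
                      '1996-06-01', '1996-06-01'])
long_df = pd.DataFrame({
    'RegionName': [60657, 75070] * 3,
    'MeanValue': [334200.0, 235700.0, 335400.0, 236900.0, 336500.0, 236700.0],
}, index=idx)
long_df.index.name = 'date'

# Selecting one zipcode leaves one value per date, so the index becomes unique ...
ts = long_df.loc[long_df['RegionName'] == 60657, 'MeanValue']

# ... and can be given an explicit month-start ('MS') frequency.
ts = ts.asfreq('MS')
print(ts.index)
```

`asfreq` needs a unique index, which is why the zipcode filter comes first. Any month missing from the data would show up as `NaN` after this step.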


# Step 3: EDA and Visualization

In [None]:
import matplotlib

# 'normal' is not a font family name, so use a generic family instead
font = {'family' : 'sans-serif',
        'weight' : 'bold',
        'size'   : 22}

matplotlib.rc('font', **font)

# NOTE: if your visualizations are too cluttered to read, try calling 'plt.gcf().autofmt_xdate()'!
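
As a rough illustration of the note above, the sketch below plots a monthly series and rotates the date labels. The values are placeholders generated with `range`, not Zillow data, and the figure size is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder monthly series: 24 month-start dates with dummy values.
dates = pd.date_range('1996-04-01', periods=24, freq='MS')
values = pd.Series(range(24), index=dates)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(values.index, values.values)
ax.set_xlabel('Month')
ax.set_ylabel('Value')

# Rotate and right-align the date tick labels so they do not overlap;
# plt.gcf().autofmt_xdate() does the same for the current figure.
fig.autofmt_xdate()
```

Calling `autofmt_xdate()` on the figure object avoids relying on whichever figure happens to be current.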

# Step 4: Reshape from Wide to Long Format

In [None]:
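def melt_data(df):
    # Expects the original wide-format frame. Keep every identifier column in
    # id_vars so that only the date columns are melted.
    melted = pd.melt(df, id_vars=['RegionID', 'RegionName', 'City', 'State', 'Metro',
                                  'CountyName', 'SizeRank'], var_name='time')
    # Column names look like '1996-04', so state the format explicitly
    melted['time'] = pd.to_datetime(melted['time'], format='%Y-%m')
    melted = melted.dropna(subset=['value'])
    # One mean value per date, averaged over all remaining rows
    return melted.groupby('time').aggregate({'value':'mean'})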

# Step 5: ARIMA Modeling

# Step 6: Interpreting Results