## Download and Analyze Data Extracts from Websites
### Tasks:
-  Extract and unzip files from websites.
-  Open and read using applicable parameters.
-  Convert to dataframes for analysis.
-  Summarize with pivot tables.
-  Save data to a spreadsheet.
-  Create visualizations in Plotly.
-  Upload resulting files to SharePoint.

### 1. Import Modules
Import the required modules for processing and analysis.

In [1]:
import urllib.request
import zipfile
import os
import re
import numpy as np
import pandas as pd
import time
from datetime import date, timedelta
from pandas import ExcelWriter
from pandas import ExcelFile
from itertools import cycle
from plotly import tools
import plotly.offline as pyo
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
from plotly.graph_objs import *

### 2. Download and Extract

Download and extract the files from the applicable websites.

#### a. Single Audit

In [2]:
# Set the url from which the zip file will be downloaded.
url_sa = 'https://www2.census.gov/pub/outgoing/govs/singleaudit//allfac.zip'
# Save the zip file to the current directory.
urllib.request.urlretrieve(url_sa, 'allfac.zip')

# Extract the txt files to the current directory.
# A relative path avoids the Windows-only separator, and the with block closes the archive.
with zipfile.ZipFile('allfac.zip', 'r') as zip_ref:
    zip_ref.extractall()
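
To try the extraction step without network access, the following minimal sketch builds a throwaway archive in a temporary directory and extracts it with a context manager. The helper `extract_zip` and the names `demo.zip` and `sample.txt` are hypothetical and are not part of the notebook.

```python
import os
import tempfile
import zipfile


def extract_zip(zip_path, target_dir):
    """Extract every member of zip_path into target_dir and return the member names."""
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(target_dir)
        return zip_ref.namelist()


# Build a throwaway archive so the sketch runs without downloading anything.
work_dir = tempfile.mkdtemp()
zip_path = os.path.join(work_dir, 'demo.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('sample.txt', 'DBKEY,AUDITYEAR\n100242,2018\n')

extracted = extract_zip(zip_path, work_dir)
print(extracted)
```

Passing an explicit target directory keeps the extracted files out of the working directory, which may be convenient when the notebook is rerun.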

#### b. Assistance Listings 

In [3]:
# Retrieve the latest assistance listings data and save to the current directory.
url_al = 'https://s3.amazonaws.com/falextracts/Assistance%20Listings/datagov/AssistanceListings_DataGov_PUBLIC_CURRENT.csv'
urllib.request.urlretrieve(url_al, 'assistance_listings.csv')

# Read into a dataframe.
# Specify the encoding parameter to avoid the UnicodeDecodeError raised by the default 'utf-8' codec.
df_al = pd.read_csv('assistance_listings.csv', encoding='iso8859_2')

### 3. Open and Read
For comparison, two methods to open and read the files are shown. The first uses a dictionary and a for loop over the txt files in the directory; the second wraps the same steps in a function and calls it on each file.

#### a. Dictionary

In [4]:
# List comprehension to create a list of txt files.
txt_files = [file for file in os.listdir() if '.txt' in file]
# Sort the list to ensure correct variable assignment when the function is mapped over it.
# list.sort() sorts in place and returns None; therefore, reassigning is unnecessary.
txt_files.sort()
# Set the dictionary.
file_col={}
# Loop through the files in the current directory.
for file in txt_files:
    # Determine the file's actual number of columns, which will be used as a parameter for reading the file.
    # Open the txt file and read the first line (i.e., header); the with block closes the file afterwards.
    with open(file, 'r') as txt_file:
        first_line = txt_file.readline()
    # Split at the delimiter (i.e., comma) and count the number of elements, effectively the number of columns.
    num_col = len(first_line.split(','))
    # Set the key-value pair for each file and corresponding number of columns.
    file_col[file] = num_col

# Loop through the file_col dictionary.
# Create another dictionary to store files when read.
# To access the file, use the format: files['file_name.txt']
files={}
for file,col in file_col.items():
    # Specify the names parameter to the actual number of columns to avoid tokenizing data error, i.e., when data is being read in columns beyond the actual.
    # Specify the encoding parameter to avoid UnicodeDecodeError: 'utf-8' error.
    # Specify the quoting parameter to avoid ParserError: Error tokenizing data.
    # Specify the low_memory parameter to avoid DtypeWarning.
    read_file = pd.read_csv(file, names=list(range(0,col)), encoding='iso8859_2', quoting=3, low_memory=False)
    # The names parameter from the preceding step creates the specified range as the column names, and sets the actual column names as the first row.
    # Reassign the first row as the column names.
    read_file.columns = read_file.iloc[0]
    # Drop the column names as the first row of data.
    read_file = read_file.reindex(read_file.index.drop(0))
    # Remove whitespace from object data type columns and assign final file to dictionary.
    files[file] = read_file.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
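
The header count above splits on every comma. That matches how the files are parsed here, because `quoting=3` (i.e., `csv.QUOTE_NONE`) tells pandas to ignore quote characters. As a hedged illustration of the difference, the sketch below compares the plain split with a quote-aware count from the standard `csv` module. The sample header is hypothetical, and `count_header_columns` is not part of the notebook.

```python
import csv
import io


def count_header_columns(text_stream):
    """Return the number of fields in the first line, honoring quoted commas."""
    reader = csv.reader(text_stream)
    header = next(reader)
    return len(header)


# Hypothetical header in which one quoted column name contains a comma.
sample = io.StringIO('DBKEY,"NAME, FULL",AMOUNT\n1,"A, B",10\n')

# Plain split, as in the notebook: every comma counts as a delimiter.
first_line = sample.readline()
naive_count = len(first_line.split(','))

# Quote-aware count: the comma inside the quotes is part of the field.
sample.seek(0)
quoted_count = count_header_columns(sample)
print(naive_count, quoted_count)
```

The two counts differ only when quotes are meant to be honored, so the plain split remains the consistent choice as long as the files are read with `quoting=3`.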

#### b. Function

In [5]:
# Define the function. Utilize the code from the preceding section.
def open_file(file):
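    # Read the header line. A distinct handle name avoids shadowing the function, and the with block closes the file.
    with open(file, 'r') as txt_file:
        first_line = txt_file.readline()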
    num_col = len(first_line.split(','))
    read_file = pd.read_csv(file, names=list(range(0,num_col)), encoding='iso8859_2', quoting=3, low_memory=False)
    read_file.columns = read_file.iloc[0]
    read_file.columns = read_file.columns.str.strip()
    read_file = read_file.reindex(read_file.index.drop(0))
    return read_file.apply(lambda x: x.str.strip() if x.dtype == "object" else x)

# Call the function on all the files and assign to variables.
# Variables are arranged alphabetically to match with the sorted txt_files list.
agency,cfda,cpas,duns,eins,findings,general,passthrough = map(open_file, txt_files)
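
The positional unpacking above works only while the sorted file list lines up with the variable order. A minimal sketch of an alternative, assuming each table can be identified by its file name: key the results by file stem instead. The helper `load_by_stem`, the file names, and the stand-in loader are hypothetical.

```python
import os


def load_by_stem(file_names, loader):
    """Map each file's base name (without extension) to loader(file_name)."""
    return {os.path.splitext(os.path.basename(name))[0]: loader(name)
            for name in file_names}


# Hypothetical file names; the stand-in loader just upper-cases the name.
# In the notebook, the loader would be the open_file function.
demo_files = ['general.txt', 'cfda.txt', 'findings.txt']
tables = load_by_stem(demo_files, loader=str.upper)
print(tables['findings'])
```

With this layout, a missing or extra file raises a `KeyError` at lookup time rather than silently shifting every assignment by one position.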

### 4. Data Cleaning and Analysis 
Analyze the data and create the applicable dataframes.

#### a. Single Audit

In [6]:
# General file
# Convert applicable column to date format.
general['FACACCEPTEDDATE'] = pd.to_datetime(general['FACACCEPTEDDATE'])
# Set the date difference to two years from the current date.
date_diff = date.today() - timedelta(730)
# Filter the general file for submissions within two years.
general = general[general['FACACCEPTEDDATE'].dt.date >= date_diff]
# Keep only the relevant columns.
general = general[['DBKEY','AUDITYEAR','EIN','AUDITEENAME','COGAGENCY','OVERSIGHTAGENCY','FACACCEPTEDDATE']]

# Findings file
# Set findings file to include only those that have findings (i.e., non-blank findings reference numbers).
findings = findings[findings['FINDINGSREFNUMS'].notnull()]
# Keep only the relevant columns.
findings = findings.drop(['ELECAUDITFINDINGSID','REPEATFINDING','PRIORFINDINGREFNUMS'],axis=1)

# Merge general with findings.
# Merge on the primary key - i.e., the combination of audityear and dbkey. The "how" parameter is left at its
# default (inner join): a general record may have multiple findings records, so it repeats for each matching
# findings record, and general records without a matching findings record are dropped.
gen_fin = pd.merge(general,findings,on=['AUDITYEAR','DBKEY'])

# Cfda file
# Filter the cfda file for USDA programs.
cfda = cfda[cfda['CFDA'].str.contains(r'10\.', regex=True)]
# Keep only the relevant columns.
cfda = cfda[['CFDA','FEDERALPROGRAMNAME','AMOUNT','FINDINGSCOUNT','ELECAUDITSID']]

# Merge with the gen_fin dataframe using elecauditsid, the key that links the cfda and findings files.
# The objective is to filter for USDA programs that had findings in the last two years.
gen_fin_cfda = pd.merge(gen_fin,cfda,on=['ELECAUDITSID'])
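
The one-to-many behavior of the general/findings merge can be checked on a small example. This is a sketch with made-up values, not the notebook's data. The `validate='one_to_many'` argument is an addition not used in the notebook: it makes pandas raise a `MergeError` if the key is not unique on the left side.

```python
import pandas as pd

# Hypothetical miniature tables; the values are made up for illustration.
general_demo = pd.DataFrame({'AUDITYEAR': ['2018', '2017'],
                             'DBKEY': ['100242', '101351'],
                             'AUDITEENAME': ['AUDITEE A', 'AUDITEE B']})
findings_demo = pd.DataFrame({'AUDITYEAR': ['2018', '2018', '2016'],
                              'DBKEY': ['100242', '100242', '999999'],
                              'FINDINGSREFNUMS': ['2018-001', '2018-002', '2016-001']})

# Inner join on the composite key. AUDITEE A repeats once per finding,
# while AUDITEE B (no findings) and the unmatched 2016 finding are dropped.
merged = pd.merge(general_demo, findings_demo, on=['AUDITYEAR', 'DBKEY'],
                  how='inner', validate='one_to_many')
print(merged)
```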

In [7]:
# Create a function that extracts the column name of the findings columns (i.e., modified opinion, material weakness,
# etc.) if the record is a "Y" for that particular column.
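def find_type(row):
    # Use only the values passed in (i.e., the findings columns of this row) rather than re-scanning the
    # whole dataframe on every call; keep the names of the columns flagged "Y" and join them.
    return '-'.join(row[row == 'Y'].index).lower()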

# Set the findings columns.
find_cols = ['MODIFIEDOPINION','OTHERNONCOMPLIANCE', 'MATERIALWEAKNESS', 
            'SIGNIFICANTDEFICIENCY','OTHERFINDINGS', 'QCOSTS']
# Apply the function on the gen_fin_cfda dataframe and create one column that contains the results.
# The objective is to combine the information from six columns into one.
gen_fin_cfda['FINDINGSTYPE'] = gen_fin_cfda.loc[:,find_cols].apply(find_type,axis=1)
# Show sample records.
gen_fin_cfda.head()
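
The flag-joining logic can be tried in isolation on a plain dictionary. This is a minimal sketch. The helper `flagged_names` and the sample record are hypothetical, while the column names are those of the findings columns above.

```python
def flagged_names(record, columns):
    """Join the lower-cased names of the columns whose value is 'Y'."""
    return '-'.join(col.lower() for col in columns if record.get(col) == 'Y')


find_cols = ['MODIFIEDOPINION', 'OTHERNONCOMPLIANCE', 'MATERIALWEAKNESS',
             'SIGNIFICANTDEFICIENCY', 'OTHERFINDINGS', 'QCOSTS']

# Hypothetical record with two flags set.
record = {'MODIFIEDOPINION': 'N', 'OTHERNONCOMPLIANCE': 'N', 'MATERIALWEAKNESS': 'N',
          'SIGNIFICANTDEFICIENCY': 'N', 'OTHERFINDINGS': 'Y', 'QCOSTS': 'Y'}
print(flagged_names(record, find_cols))  # otherfindings-qcosts
```

A record with no flags set yields an empty string, which is worth keeping in mind when the combined column is later joined or filtered.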

#### b. Assistance Listings

In [8]:
# Round the program number to three decimal places to prevent redundant characters when converting from float to string.
df_al['Program Number'] = df_al['Program Number'].map(lambda x: '{0:.3f}'.format(x))
# Create a cfda column, which is a conversion of the program number column from float to string.
# This will be used for merging with the gen_fin_cfda dataframe, therefore, it must have the same format (i.e., string).
df_al['CFDA'] = df_al['Program Number'].astype(str)
# Filter for USDA programs.
df_al = df_al[df_al['CFDA'].str.contains(r'10\.', regex=True)]

# Create a function that extracts the agency name from the federal agency column and converts to an acronym.
def agency(row):
    x = row['Federal Agency (030)'].split(',')[0]
    y = row['Federal Agency (030)'].split(', ')[1]
    acp = ['AND', 'FOR', 'OF', 'THE']
    if x != 'USDA':
        agency = x
    else:
        agency = y
    return ''.join(title[0] for title in agency.split() if title not in acp)

# Apply on the df_al dataframe.
# The syntax on a dataframe column, without applying a function:
# df_al['AGENCY'] = df_al['Federal Agency (030)'].str.split(',').str[0]
df_al['AGENCY'] = df_al.apply(agency,axis=1)

# Rename program title.
df_al.rename(columns = {'Program Title':'PROGRAMNAME'}, inplace=True)
# Keep only the relevant columns.
df_al = df_al[['PROGRAMNAME','CFDA','AGENCY']]
# Show sample records.
df_al.head()
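
The acronym logic can be exercised on plain strings. The sketch below is a standalone variant under assumptions: the input strings are hypothetical examples of the comma-separated format, and, unlike the function above, it falls back to the first part when no second part exists instead of raising an `IndexError`.

```python
STOP_WORDS = {'AND', 'FOR', 'OF', 'THE'}


def agency_acronym(federal_agency):
    """Pick the agency part of a comma-separated name and abbreviate it."""
    parts = [part.strip() for part in federal_agency.split(',')]
    # Use the second part only when the first is the department and a second part exists.
    name = parts[1] if parts[0] == 'USDA' and len(parts) > 1 else parts[0]
    # Take the first letter of every word that is not a stop word.
    return ''.join(word[0] for word in name.split() if word not in STOP_WORDS)


print(agency_acronym('USDA, FOOD AND NUTRITION SERVICE'))  # FNS
print(agency_acronym('FOREST SERVICE, USDA'))              # FS
```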

#### c. Final Dataframe

In [9]:
# Merge to create the final dataframe that matches each record with the official program name and USDA agency.
df_final = pd.merge(gen_fin_cfda,df_al,on='CFDA',how='left')
# Replace null agency names with 'UNK' and null program names with form-filled (i.e., unofficial) names.
df_final.fillna({'AGENCY':'UNK','PROGRAMNAME':df_final['FEDERALPROGRAMNAME']}, inplace=True)
# Convert applicable columns to float and int for succeeding calculations.
df_final['AMOUNT'] = df_final['AMOUNT'].astype(float)
df_final['FINDINGSCOUNT'] = df_final['FINDINGSCOUNT'].astype(int)
# Create a key column that identifies the number of unique transactions for each program in an audit year.
df_final['TRANSACTIONSCOUNT'] = df_final['AUDITYEAR'] + df_final['AMOUNT'].astype(str)
# Show sample records.
df_final.head()

| | DBKEY | AUDITYEAR | EIN | AUDITEENAME | COGAGENCY | OVERSIGHTAGENCY | FACACCEPTEDDATE | ELECAUDITSID | FINDINGSREFNUMS | TYPEREQUIREMENT | ... | OTHERFINDINGS | QCOSTS | CFDA | FEDERALPROGRAMNAME | AMOUNT | FINDINGSCOUNT | FINDINGSTYPE | PROGRAMNAME | AGENCY | TRANSACTIONSCOUNT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 100242 | 2018 | 731100380 | FOODLINK INC. AND SUBSIDIARIES (DBA REGIONAL ... | | 10.0 | 2018-11-14 | 29374440 | 2018-001 | E | ... | Y | N | 10.569 | EMERGENCY FOOD ASSISTANCE PROGRAM (FOOD COMMOD... | 4478803.0 | 1 | otherfindings | Emergency Food Assistance Program (Food Commod... | FNS | 20184478803.0 |
| 1 | 100242 | 2018 | 731100380 | FOODLINK INC. AND SUBSIDIARIES (DBA REGIONAL ... | | 10.0 | 2018-11-14 | 29374441 | 2018-001 | E | ... | Y | N | 10.568 | EMERGENCY FOOD ASSISTANCE PROGRAM (ADMINISTRAT... | 464829.0 | 1 | otherfindings | Emergency Food Assistance Program (Administrat... | FNS | 2018464829.0 |
| 2 | 100242 | 2018 | 731100380 | FOODLINK INC. AND SUBSIDIARIES (DBA REGIONAL ... | | 10.0 | 2018-11-14 | 29374442 | 2018-001 | E | ... | Y | N | 10.565 | COMMODITY SUPPLEMENTAL FOOD PROGRAM | 205926.0 | 1 | otherfindings | Commodity Supplemental Food Program | FNS | 2018205926.0 |
| 3 | 101351 | 2017 | 741238434 | TEXAS A&M RESEARCH FOUNDATION | 93.0 | | 2017-12-21 | 27977526 | 2017-002 | M | ... | N | N | 10.31 | AGRICULTURE AND FOOD RESEARCH INITIATIVE (AFRI) | 2305744.0 | 1 | significantdeficiency | Agriculture and Food Research Initiative (AFRI) | NIFA | 20172305744.0 |
| 4 | 101654 | 2017 | 741549077 | ECONOMIC OPPORTUNITIES ADVANCEMENT CORPORATION... | | 93.0 | 2018-02-21 | 28259297 | 2017-002 | B | ... | Y | Y | 10.558 | CHILD AND ADULT CARE FOOD PROGRAM | 711206.0 | 1 | otherfindings-qcosts | Child and Adult Care Food Program | FNS | 2017711206.0 |


#### d. Summarize
Create the pivot tables.

##### 1. By Auditee 

In [10]:
# Set aggfunc parameters. Utilize unique (or set()) to conduct operations only on the unique values (i.e., amount)
# for each CFDA.
len_uniq = lambda x: len(x.unique())
sum_uniq = lambda x: x.unique().sum()
join_uniq = lambda x: ', '.join(pd.unique(x))

# Create a pivot table by auditee.
df_piv_aud = pd.pivot_table(df_final,index=['AUDITEENAME','EIN','AUDITYEAR','COGAGENCY','OVERSIGHTAGENCY',
                                           'CFDA','PROGRAMNAME','AGENCY',],
                           values='AMOUNT',
                           aggfunc=sum_uniq)

# Sort each auditee record by descending amount - the resulting top USDA agency, i.e., the one that provided 
# the largest award, will be considered the oversight agency within USDA.
# Subsequently, sort by auditee name and audit year.
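# A single sort call with per-key directions keeps the descending amount order within each auditee and year.
df_piv_aud = df_piv_aud.sort_values(['AUDITEENAME','AUDITYEAR','AMOUNT'], ascending=[True,True,False])
# Reset index to convert indices to columns and, consequently, to a regular dataframe.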
df_aud = df_piv_aud.reset_index()

# Set another pivot to extract the top USDA agency.
df_over = pd.pivot_table(df_aud,index=['EIN','AUDITYEAR'],
                         aggfunc={'AMOUNT': max,'AGENCY':'first'}).reset_index()
# Show sample records.
df_over.head()

| | EIN | AUDITYEAR | AGENCY | AMOUNT |
|---|---|---|---|---|
| 0 | 10282148 | 2017 | FNS | 257600.0 |
| 1 | 10284446 | 2016 | FNS | 501805.0 |
| 2 | 10288757 | 2016 | FNS | 515348.0 |
| 3 | 10351138 | 2017 | RUS | 1313761.0 |
| 4 | 10424969 | 2016 | FNS | 369903.0 |
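
Summing only the unique amounts removes rows that repeat because of the findings merge, but it also collapses distinct awards that happen to share the same amount. The sketch below illustrates this on made-up rows. It assumes, for illustration only, that `ELECAUDITSID` identifies a single award row; the alternative of dropping duplicates on that identifier is not what the notebook does.

```python
import pandas as pd

# Hypothetical rows: one award repeated across two findings, plus a distinct award of equal size.
demo = pd.DataFrame({'CFDA': ['10.558', '10.558', '10.558'],
                     'ELECAUDITSID': ['1', '1', '2'],
                     'AMOUNT': [500.0, 500.0, 500.0]})

sum_uniq = lambda x: x.unique().sum()

# Summing unique amounts removes the repeat but also merges the distinct equal award.
by_amount = demo.groupby('CFDA')['AMOUNT'].agg(sum_uniq)

# Dropping duplicates on the assumed award identifier first keeps both awards.
by_award = demo.drop_duplicates('ELECAUDITSID').groupby('CFDA')['AMOUNT'].sum()
print(by_amount['10.558'], by_award['10.558'])
```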


##### 2. By Agency

In [11]:
# Create pivot table by agency.
df_piv_agn = pd.pivot_table(df_final,index=['AGENCY','FACACCEPTEDDATE','CFDA','PROGRAMNAME','EIN','AUDITEENAME',
                                           'AUDITYEAR','COGAGENCY','OVERSIGHTAGENCY'],
                           aggfunc={'TRANSACTIONSCOUNT':len_uniq,'AMOUNT':sum_uniq,'FINDINGSCOUNT':'mean',
                                    'TYPEREQUIREMENT':join_uniq,'FINDINGSTYPE':join_uniq,'FINDINGSREFNUMS':join_uniq})

# Stack and unstack the resulting pivot table to show the records by corresponding audit years.  
# Unstack at audit year (i.e., index 6) to convert to columns.
# df_piv_agn = df_piv_agn.stack().unstack(level=6).fillna('') #.unstack() - Further unstacking will convert aggfunc variables to subcolumns within audit years.

# Reset index to convert to a regular dataframe.
df_agn = df_piv_agn.reset_index()

# Merge df_over and df_agn to match the oversight USDA agency to each record.
df_mer_agn = pd.merge(df_agn,df_over,on=['EIN','AUDITYEAR'], how='left')

# Rename columns.
df_mer_agn.rename(columns = {'AGENCY_x':'agency','FACACCEPTEDDATE':'fac_accepted_date','CFDA':'program_number',
                             'PROGRAMNAME':'program_name','EIN':'ein','AUDITEENAME':'auditee_name',
                             'AUDITYEAR':'audit_year','COGAGENCY':'federal_cognizant',
                             'OVERSIGHTAGENCY':'federal_oversight','FINDINGSCOUNT':'findings_count',
                             'FINDINGSREFNUMS':'findings_ref_nums','FINDINGSTYPE':'findings_type',
                             'TRANSACTIONSCOUNT':'award_transactions','TYPEREQUIREMENT':'type_requirement',
                             'AMOUNT_x':'total_award','AGENCY_y':'usda_oversight', 'AMOUNT_y': 'usda_oversight_award'},
                  inplace=True)

# Rearrange columns.
df_mer_agn = df_mer_agn[['agency','fac_accepted_date','program_number','program_name','award_transactions',
                         'total_award','ein','auditee_name','audit_year','type_requirement','findings_count',
                         'findings_type','findings_ref_nums','federal_cognizant','federal_oversight','usda_oversight',
                         'usda_oversight_award']]
# Show sample records.
df_mer_agn.head()

| | agency | fac_accepted_date | program_number | program_name | award_transactions | total_award | ein | auditee_name | audit_year | type_requirement | findings_count | findings_type | findings_ref_nums | federal_cognizant | federal_oversight | usda_oversight | usda_oversight_award |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AMS | 2017-02-27 | 10.17 | Specialty Crop Block Grant Program - Farm Bill | 9 | 94519.0 | 376000511 | UNIVERSITY OF ILLINOIS | 2016 | F, L, M, ABH, ABHL, C | 6.0 | othernoncompliance-significantdeficiency, sign... | 2016-014, 2016-009, 2016-008, 2016-007, 2016-0... | 84 | | NIFA | 9250180.0 |
| 1 | AMS | 2017-03-07 | 10.17 | Specialty Crop Block Grant Program - Farm Bill | 79 | 4929947.0 | 943067788 | UNIVERSITY OF CALIFORNIA | 2016 | P | 1.0 | othernoncompliance | 2016-007 | 93 | | UNK | 55678980.0 |
| 2 | AMS | 2017-03-17 | 10.156 | Federal-State Marketing Improvement Program | 1 | 22718.0 | 990252020 | STATE OF HAWAII DEPARTMENT OF ACCOUNTING AND ... | 2016 | C, L, I | 5.0 | othernoncompliance-significantdeficiency, modi... | 2016-010, 2016-009, 2016-008, 2016-006, 2016-005 | 17 | | FS | 245322.0 |
| 3 | AMS | 2017-03-22 | 10.17 | Specialty Crop Block Grant Program - Farm Bill | 1 | 7230.0 | 726000720 | STATE OF LOUISIANA | 2016 | F, M | 3.0 | modifiedopinion-materialweakness | 2016-010, 2016-009, 2016-007 | 93 | | FNS | 1390155000.0 |
| 4 | AMS | 2017-03-24 | 10.17 | Specialty Crop Block Grant Program - Farm Bill | 8 | 111873.0 | 376005961 | SOUTHERN ILLINOIS UNIVERSITY | 2016 | ABHL, M, G | 3.0 | othernoncompliance-significantdeficiency | 2016-002, 2016-006, 2016-008 | 84 | | NIFA | 390059.0 |


### 5. Create Excel File
Save to a spreadsheet in the current directory.

In [12]:
# Create an xlsx file.
# Set timestamp. 
timestr = time.strftime('%m_%d_%Y')
# Set file name and save to current directory.
file_name = 'USDA Single Audit with Findings Report_'+ timestr + '.xlsx'
writer = pd.ExcelWriter(file_name, engine='xlsxwriter', datetime_format='yyyy-mm-dd')

# Set and sort agency list.
agencies = list(set(df_final['AGENCY']))
agencies.sort()

# Initiate workbook.
workbook = writer.book

# Create one sheet for each agency.
for agency in agencies:
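    # Pass the sheet name by keyword; newer pandas versions deprecate passing it positionally.
    df_mer_agn[df_mer_agn['agency']==agency].to_excel(writer, sheet_name=agency, index=False)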
    worksheet = writer.sheets[agency]
    money_fmt = workbook.add_format({'num_format': '$#,##0'})
    # Apply $ formatting on the applicable columns.
    worksheet.set_column('F:F',None,money_fmt)
    worksheet.set_column('Q:Q',None,money_fmt)

# Save the Excel file to the current directory.
# close() writes the file; writer.save() was removed in pandas 2.0.
writer.close()

### 6. Plotly Visualizations
Create line charts in Plotly to visualize the trend over time in award amounts and number of findings for each agency's programs that had findings within the given period (i.e., two years).

In [13]:
# Create a month-year column.
df_mer_agn['month_year'] = df_mer_agn['fac_accepted_date'].dt.strftime('%B %Y')
df_mer_agn['month_year'] = pd.to_datetime(df_mer_agn['month_year'],format='%B %Y')

# Groupby month-year, agency and program number.
df_group = df_mer_agn.groupby(['month_year','agency','program_number']).agg({'findings_count': 'sum', 'total_award': 'sum'}).reset_index()
df_plot={}
for agency in agencies:
    df_plot[agency] = df_group[df_group['agency']==agency]

# Set plot colors.
colors=['mediumpurple','red','aqua','aquamarine','mediumturquoise','olive','bisque','black','rosybrown','blue',
        'blueviolet','brown','burlywood','cadetblue','chartreuse', 'chocolate','coral','cornflowerblue','plum',
        'crimson','cyan','darkblue','darkcyan','darkgoldenrod','darkgray','magenta','darkgreen','darkkhaki',
        'darkmagenta','darkolivegreen','darkorange','darkorchid','darkred','darksalmon','darkseagreen','darkslateblue',
        'darkslategray','darkslategrey','darkturquoise','darkviolet','deeppink','deepskyblue','dimgray','dimgrey',
        'dodgerblue','firebrick','floralwhite','forestgreen','fuchsia','orchid','orangered','gold','goldenrod','gray',
        'magenta','green','greenyellow','honeydew','hotpink','indianred','indigo','lime','khaki','sandybrown','purple',
        'lawngreen','salmon','lightblue','lightcoral','mediumvioletred','lightgoldenrodyellow','paleturquoise',
        'mediumaquamarine','lightgreen','lightpink','lightsalmon','lightseagreen','lightskyblue','lightslategray',
        'midnightblue','lightsteelblue']

# Assign a color to each program number to avoid auto-assigning a different color to the same number.
# Set the unique program numbers from df_group.
nums = df_group['program_number'].sort_values().unique()
# Assign the color, repeating through the colors list after each pass.
color_plot = dict(zip(nums,cycle(colors)))
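
The `zip` and `cycle` pairing can be seen on a short example. In this minimal sketch, the inputs are hypothetical and deliberately contain fewer colors than program numbers, so the wrap-around is visible.

```python
from itertools import cycle

# Hypothetical short inputs; the notebook uses the full color list and program numbers.
demo_colors = ['red', 'blue']
demo_nums = ['10.156', '10.170', '10.558']

# zip stops at the shorter input, so the endless cycle supplies exactly one color per number.
color_map = dict(zip(demo_nums, cycle(demo_colors)))
print(color_map)
```

Because the mapping is built once from the sorted program numbers, a given program keeps its color in every subplot. Colors repeat only when there are more programs than list entries.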

In [15]:
# Set figure with subplots.
fig = tools.make_subplots(rows=len(agencies), cols=2, subplot_titles=([x for x in agencies for num in range(2)]))
# Set list holder for program numbers.
num_aw = []
# Loop through each agency.
for index, agency in enumerate(agencies):
    # Loop through each agency's program numbers.
    for num in df_plot[agency]['program_number'].unique():
        # Set x and y parameters for total award subplots.
        fig.append_trace({'x':df_plot[agency]['month_year'][df_plot[agency]['program_number']==num],
                          'y':df_plot[agency]['total_award'][df_plot[agency]['program_number']==num],
                          'name':num,
                          'legendgroup':num,
                          'showlegend':False,
                          'marker':{'color':color_plot[num]},
                          'type':'scatter'},row=index+1,col=1)
        # Append program number to list holder.
        num_aw.append(num)
        # Set x and y parameters for findings subplots.
        fig.append_trace({'x':df_plot[agency]['month_year'][df_plot[agency]['program_number']==num],
                          'y':df_plot[agency]['findings_count'][df_plot[agency]['program_number']==num],
                          'name':num,
                          'legendgroup':num,
                          # If showing legend, check num_aw list. The number shouldn't re-appear in the legend if 
                          # it exists in num_aw. Syntax: False if num in num_aw else True
                          'showlegend':False,
                          'marker':{'color':color_plot[num]},
                          'type':'scatter'},row=index+1,col=2)
        
# Set the y axis title for each subplot.
# The number of subplots is twice the number of agencies.
for i in range(len(agencies)*2): 
    # Left column subplots.
    if i%2 ==0:
        fig['layout']['yaxis'+str(i+1)].update(title='Total Award ($)')
    # Right column subplots.
    else:
        fig['layout']['yaxis'+str(i+1)].update(title='Number of Findings')

# Set title with timestamp.
fig['layout'].update(height=8000,title='<b>USDA Single Audit with Findings</b> '+'<b>'+timestr+'</b>')
# Show plots.
init_notebook_mode(connected=True)
pyo.iplot(fig, filename='Single Audit Findings Report.html')

This is the format of your plot grid:
[ (1,1) x1,y1 ]     [ (1,2) x2,y2 ]   
[ (2,1) x3,y3 ]     [ (2,2) x4,y4 ]   
[ (3,1) x5,y5 ]     [ (3,2) x6,y6 ]   
[ (4,1) x7,y7 ]     [ (4,2) x8,y8 ]   
[ (5,1) x9,y9 ]     [ (5,2) x10,y10 ] 
[ (6,1) x11,y11 ]   [ (6,2) x12,y12 ] 
[ (7,1) x13,y13 ]   [ (7,2) x14,y14 ] 
[ (8,1) x15,y15 ]   [ (8,2) x16,y16 ] 
[ (9,1) x17,y17 ]   [ (9,2) x18,y18 ] 
[ (10,1) x19,y19 ]  [ (10,2) x20,y20 ]
[ (11,1) x21,y21 ]  [ (11,2) x22,y22 ]
[ (12,1) x23,y23 ]  [ (12,2) x24,y24 ]
[ (13,1) x25,y25 ]  [ (13,2) x26,y26 ]
[ (14,1) x27,y27 ]  [ (14,2) x28,y28 ]
[ (15,1) x29,y29 ]  [ (15,2) x30,y30 ]
[ (16,1) x31,y31 ]  [ (16,2) x32,y32 ]
[ (17,1) x33,y33 ]  [ (17,2) x34,y34 ]
[ (18,1) x35,y35 ]  [ (18,2) x36,y36 ]
[ (19,1) x37,y37 ]  [ (19,2) x38,y38 ]
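
The grid printed above numbers the axes row by row, which is why the y-axis titles are assigned by the parity of the axis number. A minimal sketch of that arithmetic follows; the helper names `axis_index` and `y_title` are hypothetical.

```python
def axis_index(row, col, n_cols=2):
    """Return the 1-based axis number of the subplot at (row, col), filling row by row."""
    return (row - 1) * n_cols + col


def y_title(index):
    """Odd axis numbers fall in the left column, even ones in the right column."""
    return 'Total Award ($)' if index % 2 == 1 else 'Number of Findings'


# The last row of the printed grid: (19,1) uses axis 37 and (19,2) uses axis 38.
print(axis_index(19, 2), y_title(axis_index(19, 1)))
```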



### 7. Upload to SharePoint
Upload the spreadsheet and html files to SharePoint.

In [None]:
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File 

url = 'https://usdagcc.sharepoint.com/sites/OCFO/TARD/'
username = ''
password=''

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username,password):
    ctx = ClientContext(url, ctx_auth)
    web = ctx.web
    ctx.load(web)
    ctx.execute_query()
    print("Web title: {0}".format(web.properties['Title']))
else:
    print(ctx_auth.get_last_error())