### Notebook one

Compare the survey results for specific items identified during shoreline surveillance of Lake Geneva. Group the survey results by season (winter, spring, summer or fall), by body of water and by municipality.

__Research question:__

_Does the data indicate that shoreline litter densities are seasonal? If so, describe the relationship._

#### Contents

1. Setting up the environment with Anaconda
2. Getting data   
3. Descriptives


### Setting up

Make sure you are running the same packages as this notebook. There is a requirements.txt file for the virtual environment in the [repo](https://github.com/hammerdirt/SWE_2019.git).

If you spend more time managing the virtual environment than working with the data, try [Anaconda](https://conda.io/projects/conda/en/latest/user-guide/index.html). It works well on all machines.
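A quick way to compare your environment with the notebook's is to print the installed version of each package it imports. This is a minimal sketch using only the standard library; the package list is an assumption taken from the import cell, and the pinned versions to compare against are the ones in requirements.txt.

```python
from importlib import metadata

# Assumption: package names taken from the imports in this notebook.
# The authoritative list of pinned versions is requirements.txt in the repo.
PACKAGES = ["numpy", "pandas", "matplotlib", "requests"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```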

### Getting data

##### Imports

In [1]:
import numpy as np
import json
import csv
import pandas as pd
import matplotlib as mpl
import requests
import os

In [2]:
# The data comes from the API at https://mwshovel.pythonanywhere.com/dirt/api_home.html
# You will have the opportunity to save the data locally
# First, get the folder structure in place

folders = ["Data", "Charts", "Utilities"]
here = os.getcwd()
# This will make the directory structure for you
# !! Comment this out once you have run it !!
# def makeDirectory():
#     for folder in folders:
#         place = here +"/"+ folder
#         os.mkdir(place)
# makeDirectory()      

# never comment this out -- it is used to save output
def make_folders():
    my_folders = {}
    for folder in folders:
        place = here +"/"+ folder
        my_folders[folder] = place
    return my_folders
my_folders = make_folders()
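The two steps above (create the folders once, then build the dictionary of paths) could be combined so that nothing needs to be commented out after the first run. This is a sketch of one possible alternative, not the notebook's own method; `ensure_folders` is a hypothetical name.

```python
import os


def ensure_folders(base, names):
    """Create each folder under base if it is missing and return {name: path}."""
    paths = {}
    for name in names:
        path = os.path.join(base, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        paths[name] = path
    return paths
```

Calling `ensure_folders(os.getcwd(), ["Data", "Charts", "Utilities"])` should give the same dictionary as `make_folders()`, apart from the path separator on Windows, and it is safe to run repeatedly.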

##### Requesting data

Unfamiliar with Python requests? See [Requests for humans](https://requests.kennethreitz.org//en/v1.1.0/).

In [3]:
# use the requests library to fetch the data from the API
# these are the endpoints for all the lake data on Toys and Plastic Sheeting,
# and for the locations where they were identified:
end_points = [
    "http://mwshovel.pythonanywhere.com/dirt/beaches/Lac-L%C3%A9man/",
    "http://mwshovel.pythonanywhere.com/dirt/codes/Lac-L%C3%A9man/G32",
    "http://mwshovel.pythonanywhere.com/dirt/codes/Lac-L%C3%A9man/G67",    
]
variable_names=["beach_info", "get_toys", "get_sheeting"]

def getTheData():
    # pair each variable name with its endpoint and store the response under that name
    data = {}
    for name, end_point in zip(variable_names, end_points):
        data[name] = requests.get(end_point)
    return data

# store all that in a dictionary
data = getTheData()
# to get the data as a list of dictionaries, call .json() on the response object you want
my_beach_info = data['beach_info'].json()
# take a look at the first entry
my_beach_info[0]

{'location': 'Anarchy-Beach',
 'latitude': '46.44721600',
 'longitude': '6.85961200',
 'city': 'La-Tour-de-Peilz',
 'post': '1814',
 'water': 'l',
 'water_name': 'Lac-Léman',
 'project_id': 'MCBP',
 'owner': 'mwshovel'}
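The request cell above assumes every call succeeds. A slightly more defensive variant is sketched below: it checks the HTTP status before decoding and sets a timeout. Unlike `getTheData`, it returns the decoded JSON rather than the response objects. The name `fetch_all`, the `get` parameter and the 30 second timeout are assumptions made for illustration.

```python
def fetch_all(names, end_points, get=None, timeout=30):
    """Request each end point and return {name: decoded JSON}.

    `get` defaults to requests.get; it is a parameter only so the function
    can be tried out without network access.
    """
    if get is None:
        import requests  # imported here so a stand-in `get` works without it
        get = requests.get
    results = {}
    for name, url in zip(names, end_points):
        response = get(url, timeout=timeout)
        response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
        results[name] = response.json()
    return results
```

With the lists defined above, `fetch_all(variable_names, end_points)["beach_info"][0]` would be the same kind of record as the one shown.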

In [4]:
# need to get the code data
# the mlw codes are located here https://mwshovel.pythonanywhere.com/dirt/beach_litter.html
# there is a button that says get data in CSV format
code_data = os.path.join(here, "Data", "mlw_code_defs.csv")
code_defs = pd.read_csv(code_data)

# this will output the code for Toys:
code_defs.loc[code_defs.code=="G32"]

| Unnamed: 0 | code | material | description | source |
|---|---|---|---|---|
| 31 | G32 | Plastic | Toys and fireworks | Recreation |


##### Data type and structure

In [5]:
# This is the data that describes one entry for the toys category
my_sheets = data["get_sheeting"].json()
my_toys = data["get_toys"].json()
my_toys[0]

{'location_id': 'Baye-de-Montreux-G',
 'date': '2015-11-23',
 'code_id': 'G32',
 'length': 61,
 'quantity': 4,
 'project_id': 'MCBP',
 'owner': 'mwshovel'}
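Each record carries its survey date as a `YYYY-MM-DD` string, which is what the seasonal grouping has to be built from. A minimal sketch of that step follows. The season boundaries are an assumption (northern-hemisphere meteorological seasons, so winter is December to February); the notebook does not state which definition it uses. `season_of` and `add_season` are hypothetical helper names.

```python
from datetime import date

# Assumption: meteorological seasons for the northern hemisphere.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}


def season_of(iso_date):
    """Map a 'YYYY-MM-DD' string to a season name."""
    return SEASON_BY_MONTH[date.fromisoformat(iso_date).month]


def add_season(records):
    """Return copies of the records with a 'season' key added."""
    return [{**record, "season": season_of(record["date"])} for record in records]


# A record in the same shape as the API output shown above
sample = [{"location_id": "Baye-de-Montreux-G", "date": "2015-11-23", "quantity": 4}]
print(add_season(sample)[0]["season"])  # Fall
```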

### Descriptive statistics

In [6]:
# this can be done with or without pandas
# the end result is always sets of ordered pairs or clusters of sets of ordered pairs
# get the median, min, max, mean of pieces per meter
# get the total number of pieces per location or grouping level
# get the number of samples, the number of locations and the intersection of locations
# compare those to the whole lake

In [7]:
# make a pieces per meter value for each observation
def get_pieces_per_meter(a_list_of_objects):
    new_list_of_objects = []
    for this_object in a_list_of_objects:
        # copy the record so the original API data is left unchanged
        new_object = dict(this_object)
        # pieces per meter = quantity found / length of shoreline surveyed
        new_object["pcs_m"] = np.round(new_object["quantity"]/new_object["length"], 3)
        new_list_of_objects.append(new_object)
    return new_list_of_objects
sheetting_pcs_m = get_pieces_per_meter(my_sheets)
toys_pcs_m = get_pieces_per_meter(my_toys)
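With a `pcs_m` value on every record, the descriptive statistics listed earlier can be computed per grouping level. The sketch below does this without pandas, using only the standard library. `summarize` is a hypothetical helper and the example records are made up for illustration; they are not survey results.

```python
from collections import defaultdict
from statistics import mean, median


def summarize(records, key="location_id"):
    """Group records by `key` and summarize pcs_m and quantity for each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    summary = {}
    for group, rows in groups.items():
        values = [row["pcs_m"] for row in rows]
        summary[group] = {
            "samples": len(rows),
            "total_pieces": sum(row["quantity"] for row in rows),
            "min": min(values),
            "max": max(values),
            "median": median(values),
            "mean": round(mean(values), 3),
        }
    return summary


# Made-up records in the shape returned by get_pieces_per_meter
example = [
    {"location_id": "beach-a", "quantity": 4, "pcs_m": 0.25},
    {"location_id": "beach-a", "quantity": 8, "pcs_m": 0.75},
    {"location_id": "beach-b", "quantity": 2, "pcs_m": 0.5},
]
print(summarize(example)["beach-a"])
```

The same function should work for any other key present on the records, for example `summarize(toys_pcs_m, key="date")`.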