## Getting data and basic functions

**This workbook assumes you have a recent public release of Anaconda, including JupyterLab.** The .yml file for the specific environment is located in this repo, but it is not required to run any of these workbooks if you are using a current version of Anaconda with JupyterLab.

**Links of interest**

Scroll to the section called "Data Wrangling" [Chris Albon python basics](https://chrisalbon.com/#python)

Pandas help [Quick start](https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html)


**Questions?** dev@hammerdirt.ch

In [1]:
# workflow and sys
import os
import sys
import requests

# data handling
import pandas as pd
import numpy as np
import datetime as dt
import json


# charting
import seaborn as sns
import matplotlib.pyplot as plt

# location variables, date variables
here = os.getcwd()
today = dt.datetime.today().strftime('%Y-%m-%d')

# create a folder for today's output:
project_name = 'basic_functions'
todays_project = '{}_{}'.format(today, project_name)
project_directory = '{}/{}'.format(here, todays_project)

if not os.path.exists(project_directory):
    print("Making directory")
    os.makedirs(project_directory)
else:
    print("That project already exists")

That project already exists
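The same dated-folder step can also be written with `pathlib`. This is a minimal sketch, not the workbook's own code: `make_project_dir` is a hypothetical helper name, and the demo runs in a temporary directory with a fixed date so it leaves nothing behind.

```python
import datetime as dt
import tempfile
from pathlib import Path


def make_project_dir(base, project_name, day=None):
    """Create <base>/<YYYY-MM-DD>_<project_name> and return its Path.

    Hypothetical helper, not part of the original workbook.
    """
    day = day or dt.date.today()
    target = Path(base) / f"{day:%Y-%m-%d}_{project_name}"
    # exist_ok=True makes the call safe to repeat on the same day
    target.mkdir(parents=True, exist_ok=True)
    return target


# Demonstrate in a temporary directory with a fixed, made-up date
with tempfile.TemporaryDirectory() as tmp:
    first = make_project_dir(tmp, "basic_functions", dt.date(2020, 1, 15))
    second = make_project_dir(tmp, "basic_functions", dt.date(2020, 1, 15))
    created_name = first.name
    same_path = first == second
    existed = first.is_dir()
```

Because `mkdir` is called with `exist_ok=True`, no explicit `os.path.exists` check is needed; the trade-off is that the sketch no longer prints whether the folder was new.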


### Example one: Get beach data from the api and group the locations

1. Get the data with requests
2. Group locations by lake/river
3. Return the number of locations for each municipality in each region

In [2]:
# 1 get the url or location of the data
location="https://mwshovel.pythonanywhere.com/api/list-of-beaches/swiss/"

# use requests.get()
beaches = requests.get(location).json()

# make a data frame with pandas
df = pd.DataFrame(beaches)

# use the groupby operator from pandas
regions = df.groupby(['water_name', 'city']).location.count()

# get a list of regions
print(F"List of water features with samples = {df.water_name.unique()}")

List of water features with samples = ['Zurichsee' 'Aare' 'Aare|Nidau-Büren-Kanal' 'Lac Léman' 'Arve'
 'Lago Maggiore' 'Thunersee' 'Untersee' 'Bielersee' 'Birs' 'Bodensee'
 'Chriesbach' 'Neuenburgersee' 'Emme' 'Walensee' 'Glatt' 'Goldach'
 'Greifensee' 'Grändelbach' 'Brienzersee' 'Inn' 'Jona' 'Katzensee'
 'Dorfbach' 'La Thièle' 'Langeten' 'Rhône' 'Limmat' 'Linthkanal'
 'Escherkanal' 'Lorze' 'Lötschebach' 'Murg' 'Ognonnaz' 'Pfaffnern' 'Reuss'
 'Rhein' 'Schiffenensee' 'Schüss' 'Seez' 'Sempachsee' 'Sense' 'Sihlsee'
 'Sihl' 'Sitter' 'Thur' 'Töss' 'Urnäsch' 'Quatre Cantons' 'Vorderrhein'
 'Zugersee' 'Zulg']
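If the API is slow or returns an error, `requests.get(location).json()` can hang or fail with a confusing message. The sketch below shows one way to guard the call, assuming the endpoint returns a JSON list of records as it does above. `beaches_to_frame` and its `session` parameter are hypothetical names introduced here for illustration.

```python
import pandas as pd
import requests


def beaches_to_frame(url, session=None, timeout=10):
    """Fetch a JSON list of records from url and return a DataFrame.

    Hypothetical helper, not part of the original workbook.
    """
    # Use a caller-supplied session if given, else the requests module itself
    http = session or requests
    # timeout stops the call from waiting indefinitely on a slow server
    response = http.get(url, timeout=timeout)
    # raise_for_status turns 4xx/5xx responses into an exception
    response.raise_for_status()
    return pd.DataFrame(response.json())
```

Calling `beaches_to_frame(location)` would then take the place of the two lines that fetch the data and build `df`.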


#### Now you can find the number of survey locations for each city on a lake or river

In [3]:
regions['Aare']

city
Aarau            1
Belp             1
Bern             6
Brugg            1
Gebenstorf       1
Köniz            1
Muri bei Bern    1
Rupperswil       1
Solothurn        2
Walperswil       1
Name: location, dtype: int64
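Grouping by two columns produces a Series with a two-level index, which is why `regions['Aare']` returns counts keyed by city. The sketch below reproduces the pattern on a few made-up rows. The column names match the ones used above, but the site names and counts are invented for illustration.

```python
import pandas as pd

# Synthetic rows that mimic the three columns used above; values are made up
records = [
    {"location": "site-a", "water_name": "Aare", "city": "Bern"},
    {"location": "site-b", "water_name": "Aare", "city": "Bern"},
    {"location": "site-c", "water_name": "Aare", "city": "Thun"},
    {"location": "site-d", "water_name": "Rhein", "city": "Basel"},
]
df_demo = pd.DataFrame(records)

# Same pattern as the workbook: count locations per (water_name, city)
regions_demo = df_demo.groupby(["water_name", "city"])["location"].count()

# Selecting on the first index level returns a Series indexed by city
aare_counts = regions_demo["Aare"].to_dict()

# Summing over the city level gives the total per water feature
totals = regions_demo.groupby(level="water_name").sum().to_dict()
```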

### Example two: Get .csv data from a local file

1. Get the data with pd.read_csv( \<file location\> )
2. Return the number of records per family
3. Get the data relevant to family Canidae
4. Describe the family Canidae with respect to the data

In [4]:
# 1 get the location '/the/path/to/file'
location = "species.csv"
species = pd.read_csv(location, low_memory=False)

# 2 return the number of records per family (count() counts rows, not distinct species)
families = species.groupby(['Family'])['Scientific Name'].count()
families['Zosteraceae']

19
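Note that `count()` returns the number of non-null rows in each group, so a species recorded more than once is counted each time. If distinct names are wanted instead, `nunique()` is one option. The sketch below contrasts the two on made-up rows that reuse the `Family` and `Scientific Name` column names from `species.csv`; the data itself is invented.

```python
import pandas as pd

# Made-up rows; "Canis lupus" appears twice on purpose
species_demo = pd.DataFrame({
    "Family": ["Canidae", "Canidae", "Canidae", "Felidae"],
    "Scientific Name": ["Canis lupus", "Canis lupus", "Vulpes vulpes", "Lynx rufus"],
})

by_family = species_demo.groupby("Family")["Scientific Name"]

# count(): rows per family, duplicates included
records_per_family = by_family.count().to_dict()

# nunique(): distinct scientific names per family
names_per_family = by_family.nunique().to_dict()
```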

In [5]:
# 3 Get the data relevant to canidae
canidae = species[species.Family == 'Canidae']

# 4 Describe the family canidae with respect to the data
print(F"There are {len(canidae)} records for canidae")
print(F"Canidae is in the category {canidae.Category.unique()} and order {canidae.Order.unique()}")
print(F"There are {len(canidae['Scientific Name'].unique())} different scientific names associated with Canidae")
print(F"The scientific names: {canidae['Scientific Name'].unique()}")

There are 200 records for canidae
Canidae is in the category ['Mammal'] and order ['Carnivora']
There are 20 different scientific names associated with Canidae
The scientific names: ['Canis latrans' 'Canis lupus' 'Vulpes vulpes' 'Urocyon cinereoargenteus'
 'Vulpes macrotis' 'Vulpes velox' 'Urocyon littoralis' 'Canis familiaris'
 'Canis' 'Urocyon' 'Vulpes' 'Vulpes fulva' 'Vulpes macrotis arsipus'
 'Alopex lagopus' 'Canis latrans lestes' 'Canis lupus youngi'
 'Canis niger' 'Canis rufus' 'Vulpes vulpes necator'
 'Vulpes vulpes cascadensis']
