### If you are using Google Colab, please run the cell below. You can run a code cell by clicking in it and then clicking the play button (right-facing arrow) or pressing Shift+Return (or Shift+Enter). ONLY COLAB USERS NEED TO RUN THIS CELL:

In [1]:
!wget https://raw.githubusercontent.com/aGitHasNoName/treasure/master/animals/canidae.txt
!wget https://raw.githubusercontent.com/aGitHasNoName/treasure/master/animals/ursidae.txt

/bin/sh: wget: command not found
/bin/sh: wget: command not found


# <br><br><br>Treasures from the standard library

## <br>os module

In [2]:
import os

<br><br>Print your current working directory:

In [3]:
os.getcwd()

'/Users/colbywitherupwood/Documents/workshops/treasure'

<br><br>List the files in your current directory:

In [4]:
os.listdir()

['~$treasure.pptx',
 'treasure.pdf',
 'ursidae.txt',
 '.DS_Store',
 'treasure_answers.ipynb',
 'canidae.txt',
 'treasure.ipynb',
 '.ipynb_checkpoints',
 '.git',
 'treasure.pptx']

<br><br>Let's move the canidae.txt and ursidae.txt files into a new directory called "carnivores". These files contain lists of the living genera in each of those families.

First we can make a new directory inside our current working directory:

In [5]:
os.mkdir("carnivores")
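One thing to watch for: `os.mkdir` raises a `FileExistsError` if the folder is already there, which happens if you re-run the cell. Here is a minimal sketch of one way around that, using `os.makedirs` with `exist_ok=True`. The temporary `workspace` folder is only an assumption to keep the example from touching your real files; in the notebook you would just pass `"carnivores"`.

```python
import os
import tempfile

# Work inside a throwaway folder so the example leaves nothing behind
with tempfile.TemporaryDirectory() as workspace:
    target = os.path.join(workspace, "carnivores")

    os.makedirs(target, exist_ok=True)   # creates the folder
    os.makedirs(target, exist_ok=True)   # folder already exists, no error raised

    created = os.path.isdir(target)

print(created)
```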

<br><br>Then we can move the two text files into the new folder (remove the `#` from the front of the two lines for your type of computer - the only thing that changes is the direction of the slashes in the paths):

In [14]:
#Mac:
#os.replace("canidae.txt", "carnivores/canidae.txt")
#os.replace("ursidae.txt", "carnivores/ursidae.txt")

#Windows (the r prefix stops Python from reading the backslash as the start of an escape code like \u):
#os.replace("canidae.txt", r"carnivores\canidae.txt")
#os.replace("ursidae.txt", r"carnivores\ursidae.txt")
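If you would rather not keep two versions of the code, `os.path.join` builds the path with the right slashes for whatever computer runs it. The sketch below walks through the same move inside a temporary folder. The `workspace` folder and the two-line placeholder file are assumptions for illustration, not the real workshop files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workspace:
    # Make a small stand-in for canidae.txt
    source = os.path.join(workspace, "canidae.txt")
    with open(source, "w") as f:
        f.write("Canis\nCuon\n")

    # Make the new folder and build the destination path
    folder = os.path.join(workspace, "carnivores")
    os.mkdir(folder)
    destination = os.path.join(folder, "canidae.txt")

    # Move the file, then look inside the new folder
    os.replace(source, destination)
    moved = os.listdir(folder)
    still_in_old_place = os.path.exists(source)

print(moved)
print(still_in_old_place)
```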

<br><br>Let's see how our current working directory has changed:

In [15]:
os.listdir()

['~$treasure.pptx',
 'treasure.pdf',
 '.DS_Store',
 'treasure_answers.ipynb',
 'carnivores',
 'treasure.ipynb',
 '.ipynb_checkpoints',
 '.git',
 'treasure.pptx']

<br><br>Let's change into the carnivores directory:

In [16]:
os.chdir("carnivores")

<br><br>Confirm that you have changed directories:

In [17]:
os.getcwd()

'/Users/colbywitherupwood/Documents/workshops/treasure/carnivores'

<br><br>We can then do something with the files in that directory without having to type out the full path. I am making a list of all the genera listed in the canidae.txt file.

In [18]:
with open("canidae.txt", "r") as f:
    canidae = [line.rstrip("\n") for line in f]
print(canidae)

['Canis', 'Cuon', 'Lycaon', 'Atelocynus', 'Cerdocyon', 'Chrysocyon', 'Lycalopex', 'Speothos', 'Nyctereutes', 'Otocyon', 'Vulpes', 'Urocyon']


<br><br>Let's change back up a directory to where we used to be:

In [19]:
os.chdir("..")

<br><br>And confirm that the change worked:

In [20]:
os.getcwd()

'/Users/colbywitherupwood/Documents/workshops/treasure'

## <br><br>timeit module - time your code

timeit has several functions, but we're only going to use one, which is also called timeit. To avoid having to type timeit.timeit every time, we're going to import it this way:

In [21]:
from timeit import timeit

#### <br><br>As an example, let's test which method is faster for building a list - a list comprehension, a for loop, or `map()` with a lambda function. Specifically, we will make a list of the squares of every number from 1 to 10,000.

<br><br>The first argument to the timeit function is the code you want to time - a piece of code **as a string** or a variable that stores a piece of code **as a string**. The function also has an argument called number, which specifies how many times you want to run the code. The value you get back is the total time in seconds for all of those runs. It is often helpful to run the code many times when timing so that the numbers are bigger and easier to compare. The default number is one million, so we are going to specify one thousand runs to save some time.
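To make the `number` argument concrete, here is a small sketch that times a throwaway statement and then divides the total by the number of runs to get a rough time per run. The statement `sum(range(100))` and the variable names are just placeholders for illustration.

```python
from timeit import timeit

runs = 1000

# Total time, in seconds, for all 1,000 runs of the statement
total = timeit("sum(range(100))", number=runs)

# Rough average time for a single run
per_run = total / runs

print(f"total: {total:.6f} s")
print(f"per run: {per_run:.9f} s")
```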

#### <br><br>List comprehension

In [22]:
timeit("[i*i for i in range(1,10001)]", number=1000)

0.5940494700000016

#### <br><br>for loop

Because this method requires multiple lines of code, we will save the code as a string called `loop_squares`. We enclose the code in triple quotes:

In [23]:
loop_squares = """
new_list = []
for i in range(1,10001):
    new_list.append(i*i)
"""

In [24]:
timeit(loop_squares, number=1000)

0.9201180630000039

<br><br>You might be thinking this isn't a fair comparison. Maybe it takes longer when the code is saved as a variable. Or maybe it takes longer to store the new list to a variable, which we didn't do with the list comprehension.

#### <br><br>List comprehension saved as variable, and with storing the list to a variable

In [25]:
list_squares = """
new_list = [i*i for i in range(1,10001)]
"""

In [26]:
timeit(list_squares, number=1000)

0.5780883999999986

#### <br><br>Lambda function

In [27]:
timeit("list(map(lambda i: i*i, range(1, 10001)))", number=1000)

0.9148368689999984
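Timings wobble a bit from run to run, so a single measurement can be misleading. One possible approach, sketched below, is to use `repeat` from the same module, which does the whole timing several times and returns a list of totals, and then keep the smallest one for each method. The dictionary names and the small `number=20` are assumptions chosen to keep the example quick, so the results will not match the numbers above.

```python
from timeit import repeat

# The three approaches from this section, each saved as a string
candidates = {
    "list comprehension": "[i*i for i in range(1, 10001)]",
    "for loop": """
new_list = []
for i in range(1, 10001):
    new_list.append(i*i)
""",
    "map with lambda": "list(map(lambda i: i*i, range(1, 10001)))",
}

best_times = {}
for label, code in candidates.items():
    # repeat() returns one total time per trial; keep the fastest trial
    trials = repeat(code, number=20, repeat=3)
    best_times[label] = min(trials)

# Print the methods from fastest to slowest
for label, seconds in sorted(best_times.items(), key=lambda item: item[1]):
    print(f"{label}: {seconds:.4f} s")
```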

### <br><br>If your code is referencing code outside of the code you want to time:

Sometimes you'll have variable assignments, function definitions, or import statements that are required to run the code that you want to time. This setup code also needs to be saved as a string, in a second variable:

In [28]:
set_up = """
canidae = ['Canis', 'Cuon', 'Lycaon', 'Atelocynus', 'Cerdocyon', 'Chrysocyon', 'Lycalopex', 'Speothos', 'Nyctereutes', 'Otocyon', 'Vulpes', 'Urocyon']
"""

In [29]:
loop_dogs = """
some_dogs = []
for i in canidae:
    if "yon" in i:
        some_dogs.append(i)
"""

<br><br>To time the code in `loop_dogs` we need to pass `timeit()` both that code and the list code that we saved as `set_up`. That setup code is passed as the keyword argument `setup`. To time the `loop_dogs` code using 1,000,000 runs (the default):

In [30]:
timeit(loop_dogs, setup=set_up)

0.6437944380000005

#### <br><br>Exercise:

In [31]:
list_dogs = """
some_dogs = [i for i in canidae if "yon" in i]
"""

Write code to time the `list_dogs` code using 1,000,000 runs.

In [32]:
timeit(list_dogs, setup=set_up)

0.5951461650000027
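<br><br>As an alternative to a setup string, `timeit()` also accepts a `globals` argument: a dictionary of names that the timed code is allowed to see. Here is a minimal sketch of that approach, passing the list in directly. The smaller `number` is only there to keep the example fast. In a notebook you will often see `globals=globals()`, which shares everything you have already defined.

```python
from timeit import timeit

canidae = ['Canis', 'Cuon', 'Lycaon', 'Atelocynus', 'Cerdocyon', 'Chrysocyon',
           'Lycalopex', 'Speothos', 'Nyctereutes', 'Otocyon', 'Vulpes', 'Urocyon']

list_dogs = """
some_dogs = [i for i in canidae if "yon" in i]
"""

# Hand timeit the names it needs instead of writing a setup string
elapsed = timeit(list_dogs, globals={"canidae": canidae}, number=100_000)

print(elapsed)
```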

## <br><br>datetime

In [33]:
import datetime

In [34]:
now = datetime.datetime.now()

In [35]:
print(now)

2020-05-19 15:18:50.525315


In [36]:
now.year

2020

In [37]:
now.hour

15

<br><br>Datetime objects have a method called `strftime`, which returns a piece of the date or time as a string. The format code you pass in (such as `%A`) picks which piece you get. Try these out to see what they do:

In [38]:
now.strftime("%A")

'Tuesday'

In [39]:
now.strftime("%B")

'May'

In [40]:
now.strftime("%Y")

'2020'

In [41]:
now.strftime("%H")

'15'

In [42]:
now.strftime("%I")

'03'

In [43]:
now.strftime("%M")

'18'

In [44]:
now.strftime("%x")

'05/19/20'

In [45]:
now.strftime("%X")

'15:18:50'

In [46]:
now.strftime("%Z")

''

In [47]:
now.strftime("%p")

'PM'
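<br><br>The format codes can also be combined in one string, with any punctuation you like in between. The sketch below uses a fixed datetime (an assumption, so the output is the same every time) and also shows `strptime`, which goes the other direction and turns a string back into a datetime. Day and month names depend on your computer's language settings.

```python
import datetime

# A fixed moment so the results are repeatable
moment = datetime.datetime(2020, 5, 19, 15, 18, 50)

# Several codes in one format string
stamp = moment.strftime("%Y-%m-%d %H:%M")
friendly = moment.strftime("%A, %B %d, %Y at %I:%M %p")
print(stamp)
print(friendly)

# strptime reads a string back in, using the same format codes
parsed = datetime.datetime.strptime(stamp, "%Y-%m-%d %H:%M")
print(parsed)
```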

<br><br>You can also compare dates to see which are more recent (bigger):

In [48]:
last_christmas = datetime.datetime(2019, 12, 25)

In [49]:
last_christmas

datetime.datetime(2019, 12, 25, 0, 0)

In [50]:
last_christmas > now

False

In [51]:
now > last_christmas

True
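<br><br>Subtracting one datetime from another gives a `timedelta`, which tells you how far apart they are. You can also add a `timedelta` to a datetime to move forward in time. This sketch uses a fixed date named `workshop_day` (a made-up name standing in for `now`) so the answer does not change each time you run it.

```python
import datetime

last_christmas = datetime.datetime(2019, 12, 25)
workshop_day = datetime.datetime(2020, 5, 19)

# Subtracting two datetimes gives a timedelta
gap = workshop_day - last_christmas
print(gap.days)

# Adding a timedelta moves a datetime forward
one_week_later = last_christmas + datetime.timedelta(days=7)
print(one_week_later)
```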

## <br><br>math and statistics

In [52]:
import math
import statistics

In [53]:
numbers = [7, 15, 80, 189, 573892]

<br><br>Let's try out some common functions on our list of numbers. Math has many more functions than statistics.

In [54]:
statistics.mean(numbers)

114836.6

In [55]:
statistics.median(numbers)

80

In [56]:
statistics.stdev(numbers)

256619.78029820696

In [57]:
[math.sqrt(i) for i in numbers]

[2.6457513110645907,
 3.872983346207417,
 8.94427190999916,
 13.74772708486752,
 757.5565985456136]

In [58]:
[math.pi * i * i for i in numbers]

[153.93804002589985,
 706.8583470577034,
 20106.192982974677,
 112220.83117888101,
 1034689910554.1246]
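<br><br>The two modules work well together. As a rough sketch of what `statistics.stdev` is doing, the code below computes the sample standard deviation by hand with `math.sqrt` and then checks it against the library's answer. The variable names are just for illustration.

```python
import math
import statistics

numbers = [7, 15, 80, 189, 573892]

mean = statistics.mean(numbers)

# Add up the squared distance of each number from the mean
squared_diffs = sum((x - mean) ** 2 for x in numbers)

# The sample standard deviation divides by n - 1 before taking the square root
manual_stdev = math.sqrt(squared_diffs / (len(numbers) - 1))

print(manual_stdev)
print(math.isclose(manual_stdev, statistics.stdev(numbers)))
```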

<br><br>There are lots of handy functions in math, so check the documentation to see them all. Here's one of my favorites:

In [59]:
math.isclose(4.01, 4.02)

False

<br><br>By default the two numbers have to be extremely close (a relative tolerance of 1e-09), but you can customize the tolerance for your particular project:

In [60]:
math.isclose(4.01, 4.02, rel_tol=.05)

True
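<br><br>Where this really earns its keep is with floats, which pick up tiny rounding errors. A short sketch: `0.1 + 0.2` is not exactly equal to `0.3`, but `math.isclose` treats them as the same. When comparing against zero, a relative tolerance does not help, so the `abs_tol` argument is used instead. The variable names here are only for illustration.

```python
import math

total = 0.1 + 0.2

exact_match = total == 0.3               # rounding error makes this False
close_match = math.isclose(total, 0.3)   # True with the default tolerance
print(exact_match, close_match)

# Near zero, use an absolute tolerance instead of a relative one
tiny = 1e-10
near_zero_default = math.isclose(tiny, 0.0)
near_zero_abs = math.isclose(tiny, 0.0, abs_tol=1e-9)
print(near_zero_default, near_zero_abs)
```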

## <br><br>random

In [61]:
import random

<br><br>random has many different functions. This gives you a random float between 0 and 1 (it can be 0, but never exactly 1):

In [62]:
random.random()

0.8683105554977321

<br><br>random can pull a random item out of a group:

In [63]:
animals = ["parrot", "seal", "panda", "python", "raccoon dog", "amazonian short-eared dog"]

In [64]:
random.choice(animals)

'amazonian short-eared dog'

<br><br>Let's shuffle that list into a random order. WARNING - this changes the order, so if you will need the original order later, it's best to make a copy of the list first.

In [65]:
animals_copy = animals.copy()
random.shuffle(animals_copy)
animals_copy

['panda',
 'amazonian short-eared dog',
 'seal',
 'python',
 'raccoon dog',
 'parrot']

<br><br>More random items from lists:

In [66]:
random.sample(animals, 2)

['panda', 'amazonian short-eared dog']

In [67]:
random.choice(["heads", "tails"])

'heads'

<br><br>This will give you a random even number from 2 to 100:

In [68]:
random.randrange(2,101,2)

24
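<br><br>If you need the same "random" results every time, for example so that a collaborator can reproduce your work, you can set a seed first. This is a minimal sketch. The seed value 42 is arbitrary, and any number works as long as you reuse it.

```python
import random

# Setting the seed makes the sequence of random values repeatable
random.seed(42)
first = [random.randrange(2, 101, 2) for _ in range(5)]

# Resetting to the same seed replays the same sequence
random.seed(42)
second = [random.randrange(2, 101, 2) for _ in range(5)]

print(first)
print(first == second)
```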

## <br><br>Bonus Lesson

### pathlib

<br>Remember earlier in the notebook when we had to change the code for Windows or Mac filepaths? This module makes it so you never have to worry about that. 

In [69]:
from pathlib import Path

When saving your path as a variable, wrap it in `Path()`. Write the path with `/` - forward slashes - even if you are on a Windows computer. It will automatically convert the path to the correct format for the operating system of whoever is running your code.

In [70]:
current_file_location = Path("carnivores/canidae.txt")

In [71]:
os.replace(current_file_location, "canidae.txt")

In [73]:
os.listdir()

['~$treasure.pptx',
 'treasure.pdf',
 '.DS_Store',
 'treasure_answers.ipynb',
 'carnivores',
 'canidae.txt',
 'treasure.ipynb',
 '.ipynb_checkpoints',
 '.git',
 'treasure.pptx']
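<br><br>Path objects can do more than hold a path. As a quick sketch, you can join pieces of a path with the `/` operator and pull out parts of the path as attributes. Nothing here touches the disk, so the file does not need to exist.

```python
from pathlib import Path

folder = Path("carnivores")

# The / operator joins pieces of a path
file_path = folder / "canidae.txt"

print(file_path.name)     # the file name with its extension
print(file_path.stem)     # the file name without its extension
print(file_path.suffix)   # just the extension
print(file_path.parent)   # the folder that holds the file
```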