# Plant List Flashcards

This notebook takes David's list of 500 plants and turns it into Anki flashcards for study.

I have already cleaned this list manually in Excel. Specifically, I have:

1. Resolved synonyms or removed duplicate genus names from the table, matching names to current CalFlora.
2. Fixed minor mix-ups between ssp. and spp.
3. Expanded the acronyms used for habit descriptions
4. Made native vs. non-native explicit
5. Added a boolean indicating whether only the genus or the full scientific name is required

I could have done it in this notebook, but honestly I think it was faster to just go through the document. 
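For a quick sanity check on the manual cleanup, here is a minimal sketch of one consistency test. It assumes (my assumption, based only on the rows visible in this notebook) that a name ending in "spp." should always have `Genus Only` set to true, and vice versa. The function name `genus_flag_mismatch` and the inconsistent example row are made up for illustration.

```python
def genus_flag_mismatch(scientific_name: str, genus_only: bool) -> bool:
    """Return True when the 'spp.' suffix and the Genus Only flag disagree."""
    has_spp = scientific_name.strip().endswith("spp.")
    return has_spp != bool(genus_only)


# Two rows copied from the list, plus one deliberately broken row (hypothetical).
rows = [
    ("Abies concolor", False),
    ("Abronia spp.", True),
    ("Wyethia spp.", False),  # made-up inconsistency, not in the real data
]

mismatches = [name for name, flag in rows if genus_flag_mismatch(name, flag)]
print(mismatches)  # ['Wyethia spp.']
```

On the real dataframe, the same function could be applied row by row with `df.apply(lambda r: genus_flag_mismatch(r['Scientific Name'], r['Genus Only']), axis=1)`.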

To be fair, I have not verified *each name*. There is definitely the possibility that some of the names won't match whatever taxonomy I choose to use.

As for the flashcards, I want to present a photo, preferably the "iconic" photo from iNaturalist for the taxon, and then ask for either the full scientific name or just the genus. The flip side of the card will be the name. Bonus points if it has to be typed out.

In [30]:
import pandas as pd

df = pd.read_excel('CommonPlantsBotanistMustSightID-Alphabetic-20191009.xlsx')
df

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity
0,Abies concolor,California White Fir,Tree,Pinaceae,False,Native
1,Abies magnifica,Red Fir,Tree,Pinaceae,False,Native
2,Abronia spp.,Sand-verbena,Perennial herb,Nyctaginaceae,True,Native
3,Acacia dealbata,Silver Wattle,Shrub/Tree,Fabaceae,False,Non-native
4,Acer glabrum,Rocky Mountain Maple,Shrub,Sapindaceae,False,Native
...,...,...,...,...,...,...
496,Washingtonia robusta,Mexican Fan Palm,Tree,Arecaceae,False,Non-native
497,Woodwardia fimbriata,Giant Chain Fern,Perennial fern or fern ally,Blechnaceae,False,Native
498,Wyethia spp.,Mule Ears,Perennial herb,Asteraceae,True,Native
499,Xanthium spp.,Spiny or Spring Clotbur,Annual herb,Asteraceae,True,Native


Let's check the data types and make any changes necessary.

In [31]:
df.dtypes


Scientific Name    object
Common Name        object
HabiTree           object
Family             object
Genus Only           bool
Nativity           object
dtype: object

Looks good! Now let's get the photo URL from iNaturalist.

In [32]:
from pyinaturalist import get_taxa

url = get_taxa('Xanthium')['results'][0]['default_photo']['medium_url']
url = url.replace('medium', 'large')
url

'https://inaturalist-open-data.s3.amazonaws.com/photos/12627/large.jpg'

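A plain `url.replace('medium', 'large')` swaps every occurrence of "medium" in the URL, not just the size in the file name. Here is a minimal sketch of a narrower swap, assuming the photo URLs always end in `<size>.<ext>` as the two URLs printed in this notebook do. The helper name `with_photo_size` is made up.

```python
from urllib.parse import urlsplit, urlunsplit


def with_photo_size(url: str, size: str = "large") -> str:
    """Swap the size in the final path segment, e.g. medium.jpg -> large.jpg."""
    parts = urlsplit(url)
    head, _, filename = parts.path.rpartition("/")
    stem, dot, ext = filename.rpartition(".")
    if stem != "medium":
        # Unexpected URL shape: return it unchanged rather than guess.
        return url
    return urlunsplit(parts._replace(path=f"{head}/{size}{dot}{ext}"))


medium = "https://inaturalist-open-data.s3.amazonaws.com/photos/46738687/medium.jpg"
print(with_photo_size(medium))
# https://inaturalist-open-data.s3.amazonaws.com/photos/46738687/large.jpg
```

Any query string on the URL is kept as-is, since only the path is rewritten.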
It looks like having "spp." in the name throws off the query. So in the following code I will simply remove that and have clean genus names only. 

In [34]:
# remove "spp." from the end of genus names in df
# regex=False matches the "." literally, whatever the pandas default is
df['Search Name'] = df['Scientific Name'].str.replace('spp.', '', regex=False)
df

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity,Search Name
0,Abies concolor,California White Fir,Tree,Pinaceae,False,Native,Abies concolor
1,Abies magnifica,Red Fir,Tree,Pinaceae,False,Native,Abies magnifica
2,Abronia spp.,Sand-verbena,Perennial herb,Nyctaginaceae,True,Native,Abronia
3,Acacia dealbata,Silver Wattle,Shrub/Tree,Fabaceae,False,Non-native,Acacia dealbata
4,Acer glabrum,Rocky Mountain Maple,Shrub,Sapindaceae,False,Native,Acer glabrum
...,...,...,...,...,...,...,...
496,Washingtonia robusta,Mexican Fan Palm,Tree,Arecaceae,False,Non-native,Washingtonia robusta
497,Woodwardia fimbriata,Giant Chain Fern,Perennial fern or fern ally,Blechnaceae,False,Native,Woodwardia fimbriata
498,Wyethia spp.,Mule Ears,Perennial herb,Asteraceae,True,Native,Wyethia
499,Xanthium spp.,Spiny or Spring Clotbur,Annual herb,Asteraceae,True,Native,Xanthium


Let's test to see if I can get rid of the "ssp." as well. I doubt it, but would love to. 

In [38]:
# filter rows that still contain "ssp." in the Search Name column
df[df['Search Name'].str.contains("ssp.", regex=False)]

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity,Search Name


In [36]:
get_taxa('Sambucus nigra caerulea')['results'][0]['default_photo']['medium_url']

'https://inaturalist-open-data.s3.amazonaws.com/photos/46738687/medium.jpg'

Bless the iNaturalist folks, they don't like ssp. either!

Looks like I can safely remove that as well.

In [37]:
# regex=False so that "." is a literal dot and not a wildcard
df['Search Name'] = df['Search Name'].str.replace('ssp.', '', regex=False)
df['Search Name'] = df['Search Name'].str.replace('var.', '', regex=False)
df

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity,Search Name
0,Abies concolor,California White Fir,Tree,Pinaceae,False,Native,Abies concolor
1,Abies magnifica,Red Fir,Tree,Pinaceae,False,Native,Abies magnifica
2,Abronia spp.,Sand-verbena,Perennial herb,Nyctaginaceae,True,Native,Abronia
3,Acacia dealbata,Silver Wattle,Shrub/Tree,Fabaceae,False,Non-native,Acacia dealbata
4,Acer glabrum,Rocky Mountain Maple,Shrub,Sapindaceae,False,Native,Acer glabrum
...,...,...,...,...,...,...,...
496,Washingtonia robusta,Mexican Fan Palm,Tree,Arecaceae,False,Non-native,Washingtonia robusta
497,Woodwardia fimbriata,Giant Chain Fern,Perennial fern or fern ally,Blechnaceae,False,Native,Woodwardia fimbriata
498,Wyethia spp.,Mule Ears,Perennial herb,Asteraceae,True,Native,Wyethia
499,Xanthium spp.,Spiny or Spring Clotbur,Annual herb,Asteraceae,True,Native,Xanthium


Great, now that that's sorted, let's strip the leftover whitespace and then generate the photo URL column.

In [42]:
df['Search Name'] = df['Search Name'].str.strip()
df

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity,Search Name
0,Abies concolor,California White Fir,Tree,Pinaceae,False,Native,Abies concolor
1,Abies magnifica,Red Fir,Tree,Pinaceae,False,Native,Abies magnifica
2,Abronia spp.,Sand-verbena,Perennial herb,Nyctaginaceae,True,Native,Abronia
3,Acacia dealbata,Silver Wattle,Shrub/Tree,Fabaceae,False,Non-native,Acacia dealbata
4,Acer glabrum,Rocky Mountain Maple,Shrub,Sapindaceae,False,Native,Acer glabrum
...,...,...,...,...,...,...,...
496,Washingtonia robusta,Mexican Fan Palm,Tree,Arecaceae,False,Non-native,Washingtonia robusta
497,Woodwardia fimbriata,Giant Chain Fern,Perennial fern or fern ally,Blechnaceae,False,Native,Woodwardia fimbriata
498,Wyethia spp.,Mule Ears,Perennial herb,Asteraceae,True,Native,Wyethia
499,Xanthium spp.,Spiny or Spring Clotbur,Annual herb,Asteraceae,True,Native,Xanthium
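The cleaning cells above (drop "spp.", "ssp." and "var.", then strip) can be folded into one helper. This is a minimal sketch, not what I actually ran: the name `to_search_name` is made up, and it uses a regular expression with an escaped dot so that only the standalone abbreviations are removed. With an unescaped `var.` pattern, an epithet such as "variicolor" would lose its first four letters.

```python
import re

# Standalone rank abbreviations followed by whitespace or the end of the name.
_RANK_MARKERS = re.compile(r"\b(?:spp|ssp|var)\.(?=\s|$)")


def to_search_name(scientific_name: str) -> str:
    """Drop spp./ssp./var. markers and collapse the leftover whitespace."""
    without_ranks = _RANK_MARKERS.sub(" ", scientific_name)
    return " ".join(without_ranks.split())


print(to_search_name("Abronia spp."))                         # Abronia
print(to_search_name("Castilleja applegatei ssp. martinii"))  # Castilleja applegatei martinii
print(to_search_name("Lupinus variicolor"))                   # Lupinus variicolor
```

On the dataframe this would be `df['Search Name'] = df['Scientific Name'].map(to_search_name)`.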


In [43]:
from pyinaturalist import get_taxa

# Look up the default photo for a taxon on iNaturalist.
# Returns the medium-size URL, or None if the lookup fails or the taxon has no photo.
def get_default_photo(scientific_name):
    try:
        url = get_taxa(scientific_name)['results'][0]['default_photo']['medium_url']
        # url = url.replace('medium', 'large')  # uncomment to request the large size
        return url
    except Exception:
        return None

# apply the function to the taxa dataframe

df['photo_url'] = df['Search Name'].apply(get_default_photo)

In [44]:
df.head()

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity,Search Name,photo_url
0,Abies concolor,California White Fir,Tree,Pinaceae,False,Native,Abies concolor,https://inaturalist-open-data.s3.amazonaws.com...
1,Abies magnifica,Red Fir,Tree,Pinaceae,False,Native,Abies magnifica,https://static.inaturalist.org/photos/86547995...
2,Abronia spp.,Sand-verbena,Perennial herb,Nyctaginaceae,True,Native,Abronia,https://inaturalist-open-data.s3.amazonaws.com...
3,Acacia dealbata,Silver Wattle,Shrub/Tree,Fabaceae,False,Non-native,Acacia dealbata,https://static.inaturalist.org/photos/12773260...
4,Acer glabrum,Rocky Mountain Maple,Shrub,Sapindaceae,False,Native,Acer glabrum,https://inaturalist-open-data.s3.amazonaws.com...


In [45]:
df[df['photo_url'].isnull()]

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity,Search Name,photo_url
6,Acer negundo,Box Elder,Shrub/Tree,Sapindaceae,False,Native,Acer negundo,
90,Castilleja applegatei ssp. martinii,Martin Indian Paintbrush,Perennial herb,Orobanchaceae,False,Native,Castilleja applegatei martinii,


In [57]:
import os
import requests
from tqdm import tqdm

# Directory to save images
img_dir = 'plant_images/'
os.makedirs(img_dir, exist_ok=True)

# Download one image; raises requests.RequestException on a network error or a non-200 status
def download_image(url, filename):
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    with open(filename, 'wb') as file:
        file.write(response.content)

# Loop through the dataframe and download images, recording None where no file was saved
local_paths = []
for idx, row in tqdm(df.iterrows(), total=df.shape[0]):
    local_path = None
    if pd.notna(row['photo_url']):
        candidate = os.path.join(img_dir, row['Scientific Name'].replace(' ', '_') + '.jpg')
        try:
            download_image(row['photo_url'], candidate)
            local_path = candidate
        except (requests.RequestException, OSError):
            pass  # leave local_path as None so the card falls back to text only
    local_paths.append(local_path)

df['local_photo_path'] = local_paths
df

Unnamed: 0,Scientific Name,Common Name,HabiTree,Family,Genus Only,Nativity,Search Name,photo_url,local_photo_path
0,Abies concolor,California White Fir,Tree,Pinaceae,False,Native,Abies concolor,https://inaturalist-open-data.s3.amazonaws.com...,plant_images/Abies_concolor.jpg
1,Abies magnifica,Red Fir,Tree,Pinaceae,False,Native,Abies magnifica,https://static.inaturalist.org/photos/86547995...,plant_images/Abies_magnifica.jpg
2,Abronia spp.,Sand-verbena,Perennial herb,Nyctaginaceae,True,Native,Abronia,https://inaturalist-open-data.s3.amazonaws.com...,plant_images/Abronia_spp..jpg
3,Acacia dealbata,Silver Wattle,Shrub/Tree,Fabaceae,False,Non-native,Acacia dealbata,https://static.inaturalist.org/photos/12773260...,plant_images/Acacia_dealbata.jpg
4,Acer glabrum,Rocky Mountain Maple,Shrub,Sapindaceae,False,Native,Acer glabrum,https://inaturalist-open-data.s3.amazonaws.com...,plant_images/Acer_glabrum.jpg
...,...,...,...,...,...,...,...,...,...
496,Washingtonia robusta,Mexican Fan Palm,Tree,Arecaceae,False,Non-native,Washingtonia robusta,https://inaturalist-open-data.s3.amazonaws.com...,plant_images/Washingtonia_robusta.jpg
497,Woodwardia fimbriata,Giant Chain Fern,Perennial fern or fern ally,Blechnaceae,False,Native,Woodwardia fimbriata,https://inaturalist-open-data.s3.amazonaws.com...,plant_images/Woodwardia_fimbriata.jpg
498,Wyethia spp.,Mule Ears,Perennial herb,Asteraceae,True,Native,Wyethia,https://inaturalist-open-data.s3.amazonaws.com...,plant_images/Wyethia_spp..jpg
499,Xanthium spp.,Spiny or Spring Clotbur,Annual herb,Asteraceae,True,Native,Xanthium,https://inaturalist-open-data.s3.amazonaws.com...,plant_images/Xanthium_spp..jpg
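The genus-level rows end up with a double dot in the file name (`Abronia_spp..jpg`). That works, but it is easy to avoid. Here is a minimal sketch of a tidier file name builder. The name `image_filename` is made up, and collapsing everything that is not an ASCII letter or digit into an underscore is my own choice, not something the download cell does.

```python
import re


def image_filename(scientific_name: str, ext: str = ".jpg") -> str:
    """Build a file name with only letters, digits and single underscores."""
    stem = re.sub(r"[^A-Za-z0-9]+", "_", scientific_name).strip("_")
    return stem + ext


print(image_filename("Abronia spp."))                         # Abronia_spp.jpg
print(image_filename("Castilleja applegatei ssp. martinii"))  # Castilleja_applegatei_ssp_martinii.jpg
```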


In [58]:

import os

# Build the front of the card from the "Genus Only" flag and the image path
def create_front(row):
    question = "What is the genus of this plant?" if row["Genus Only"] else "What is the scientific name of this plant?"
    if pd.notna(row["local_photo_path"]):
        # Anki looks images up by bare file name in its media folder, so drop the directory
        image_name = os.path.basename(row["local_photo_path"])
        return f'<img src="{image_name}"><br>{question}'
    return question

# Construct the Anki card format
df["Front"] = df.apply(create_front, axis=1)
df["Back"] = df["Scientific Name"]

# Save as a semicolon-separated CSV
anki_csv_path = "anki_flashcards.csv"
df[["Front", "Back"]].to_csv(anki_csv_path, index=False, sep=";")

anki_csv_path


'anki_flashcards.csv'
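One loose end: the genus-only cards ask for the genus, but the back still reads "Abronia spp.". If the answer has to be typed (for example with Anki's type-in-the-answer field in the card template), the back should probably be exactly what I expect to type. Here is a minimal sketch of that idea, with a made-up helper name `expected_answer`. It is not what the cell above does.

```python
def expected_answer(scientific_name: str, genus_only: bool) -> str:
    """Return just the genus for genus-only cards, else the full name."""
    if genus_only:
        words = scientific_name.split()
        return words[0] if words else ""
    return scientific_name


print(expected_answer("Abronia spp.", True))     # Abronia
print(expected_answer("Abies concolor", False))  # Abies concolor
```

On the dataframe this could replace the `Back` column with `df.apply(lambda r: expected_answer(r['Scientific Name'], r['Genus Only']), axis=1)`.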