# Websites and APIs - Part 2

## [Spyder](https://docs.spyder-ide.org/current/index.html) 
[Spyder](https://docs.spyder-ide.org/current/index.html) is an integrated development environment (IDE) that gives you quick feedback as you iteratively write your code. Designed by and for scientists, engineers, and data analysts, Spyder lets you write code interactively, explore your data, and more.

# Activity: Meet the Animals - Part 2

[Meet the Animals](https://nationalzoo.si.edu/animals/list) at the Smithsonian National Zoo & Conservation Biology Institute.

Use pandas to read the `i_met_the_animals.csv` file into a DataFrame and create a list of animal common names from the `common_name` column. Then iterate through the list to gather the following elements from each animal's webpage.

- Common name
- Scientific name
- Taxonomic information
     - Class
     - Order
     - Family
     - Genus and species
- Physical description
- Size
- Native habitat
- Conservation status
- Fun facts

## Step 1. Copyright | Terms of Use
Locate and read the terms of use for the [Smithsonian's National Zoo & Conservation Biology Institute](https://nationalzoo.si.edu/).

## Step 2. Is an API available?
Technically yes. See the [Smithsonian Institution Open Access API documentation](https://edan.si.edu/openaccess/apidocs/#api-_).

## Step 3. Inspect the elements
Inspect the HTML. Familiarize yourself with the location of the elements listed above, and how the DOM is structured.

## Step 4. Identify Python Libraries for Project
### [requests](https://requests.readthedocs.io/en/latest/)
The [requests](https://requests.readthedocs.io/en/latest/) library retrieves HTML or XML documents from a server and processes the response. 
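
As a minimal sketch of how a page might be retrieved, the helper below wraps `requests.get`. The function name `fetch_html`, the 10-second timeout, and the URL in the comment are assumptions chosen for illustration; they are not part of the activity files.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the HTML text of a page, or raise an error if the request fails."""
    # timeout stops the program from waiting forever on a slow server
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return response.text


# Example call (commented out because it needs a network connection):
# html = fetch_html("https://nationalzoo.si.edu/animals/list")
```

Checking the status before using `response.text` helps avoid parsing an error page as if it were an animal page.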

### [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/)

[BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/) parses HTML and XML documents, helping you search for and extract elements from the DOM. 
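
The sketch below shows the general pattern of searching a parsed document. The HTML string, tag names, class names, and animal names are all made up for illustration; the real pages will use a different structure, which is what Step 3 asks you to inspect.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: the real page's tags and class names will differ
sample_html = """
<html>
  <body>
    <h1 class="animal-name">Example animal</h1>
    <p class="scientific-name">Exemplum animalis</p>
  </body>
</html>
"""

# html.parser is the parser that ships with Python
soup = BeautifulSoup(sample_html, "html.parser")

# find() returns the first matching tag, or None if nothing matches
name_tag = soup.find("h1", class_="animal-name")
common_name = name_tag.get_text(strip=True)

# select_one() does the same kind of lookup with a CSS selector
scientific_name = soup.select_one("p.scientific-name").get_text(strip=True)

print(common_name)
print(scientific_name)
```

Because `find()` returns `None` when a tag is absent, calling `.get_text()` on a missing tag raises an `AttributeError`, which is one reason error handling matters when scraping.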

### [pandas](https://pandas.pydata.org/docs/user_guide/index.html)
Pandas is a large Python library used for manipulating and analyzing tabular data. 

#### [.read_csv( )](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html)
Reads a .csv file into a pandas DataFrame.

`pd.read_csv('INSERT FILEPATH HERE')`


In [None]:
import pandas as pd
df=pd.read_csv('i_met_the_animals.csv') #df is a common abbreviation for DataFrame

df

#### [.tolist( )](https://pandas.pydata.org/docs/reference/api/pandas.Series.to_list.html)
Converts a column in a pandas DataFrame to a list.

`Series.tolist()`

In [None]:
animals=df.common_name.tolist()
animals

#### [.dropna( )](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dropna.html)

Drops rows or columns that contain `NaN` or null values. Use carefully. By default (`axis=0`, `how='any'`), pandas drops every row that has a `NaN` in any column; with `axis=1` it drops columns instead. Pass `subset` to check only specific columns. Because `dropna` returns a new DataFrame unless `inplace=True`, consider assigning the result to a new variable name so the original data stays available.

`DataFrame.dropna(*, axis=0, how=<no_default>, thresh=<no_default>, subset=None, inplace=False, ignore_index=False)`

In [None]:
animals_without_null_common_names=df.dropna(subset=['common_name']) #only drop rows where common_name is missing
animals_without_null_common_names
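
To make the effect of `axis` and `subset` concrete, here is a small sketch on a made-up DataFrame. The name `toy` and its values are invented for illustration and are not taken from `i_met_the_animals.csv`.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame({
    "common_name": ["Animal A", np.nan, "Animal C"],
    "size": [np.nan, 2.0, 3.0],
})

# Default (axis=0, how='any'): drop every row with at least one NaN
drop_any = toy.dropna()

# subset: only look at common_name when deciding which rows to drop
drop_missing_names = toy.dropna(subset=["common_name"])

# axis=1: drop every column that contains at least one NaN
drop_columns = toy.dropna(axis=1)

print(len(toy), len(drop_any), len(drop_missing_names), drop_columns.shape)
```

Comparing the lengths shows how much data each option discards: the default keeps only the fully populated row, while `subset` keeps every row that has a name.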

#### [.fillna( )](https://pandas.pydata.org/docs/reference/api/pandas.Series.fillna.html#pandas.Series.fillna)

Replaces `NaN` values with a value you specify.

`Series.fillna(value=None, *, method=None, axis=None, inplace=False, limit=None, downcast=<no_default>)`

In [None]:
null_common_names_now_empty_string=df.fillna({'common_name':''}) #fill only the common_name column
null_common_names_now_empty_string
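
The following sketch contrasts filling a single column with filling several columns at once. The DataFrame `toy` and the replacement values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame({
    "common_name": ["Animal A", np.nan],
    "size": [1.5, np.nan],
})

# Fill one column: Series.fillna returns a new Series
names_filled = toy["common_name"].fillna("")

# Fill several columns with different values by passing a dictionary
filled = toy.fillna({"common_name": "", "size": 0})

print(names_filled.tolist())
print(filled)
```

Passing a dictionary keeps text columns and numeric columns from being filled with the same value, and `toy` itself is left unchanged.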

#### [.iterrows( )](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iterrows.html#pandas.DataFrame.iterrows)

Iterates over DataFrame rows as (index, Series) pairs.

`DataFrame.iterrows()`

In [None]:
for idx, row in df.iterrows():
    print(row.common_name)

#### [.iloc](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html)

Selects rows (and optionally columns) by integer position. In a slice, the start position is included and the end position is excluded.

`DataFrame.iloc[start:end]`

In [None]:
for idx, row in df.iloc[0:1].iterrows():
    print(row.common_name)
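
Slicing with `.iloc` is a convenient way to test a loop on a few rows before running it on every animal. The sketch below uses a made-up DataFrame named `toy` to show three common forms of position-based selection.

```python
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame(
    {"common_name": ["Animal A", "Animal B", "Animal C", "Animal D"]}
)

# Rows at positions 0 and 1; position 2 is excluded
first_two = toy.iloc[0:2]

# A single position returns one row as a Series; -1 means the last row
last_row = toy.iloc[-1]

# Row position 2, column position 0 returns a single value
one_cell = toy.iloc[2, 0]

print(first_two)
print(last_row["common_name"], one_cell)
```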

#### [.concat( )](https://pandas.pydata.org/docs/reference/api/pandas.concat.html#pandas.concat)

Joins DataFrames along a particular axis (rows=0, columns=1).

`pandas.concat(objs, *, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=None)`

In [None]:
results=pd.DataFrame(columns=['common_name','size'])
for idx, row in df.iterrows():
    common_name=row.common_name
    size=10 #placeholder value
    data_row={
        'common_name':common_name,
        'size':size
    }
    data=pd.DataFrame(data_row, index=[0]) #one-row DataFrame for this animal
    #list results first so new rows are added at the bottom, in the original order
    results=pd.concat([results, data], axis=0, ignore_index=True)

results
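
Calling `pd.concat` inside a loop copies the accumulated data on every pass, which can become slow for long lists. One common alternative, sketched here as an option rather than the workshop's method, is to collect one dictionary per animal in a plain list and build the DataFrame once at the end. The names `toy`, `rows`, and `results_from_list` are hypothetical.

```python
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame({"common_name": ["Animal A", "Animal B", "Animal C"]})

rows = []  # one dictionary per animal
for idx, row in toy.iterrows():
    rows.append({
        "common_name": row.common_name,
        "size": 10,  # placeholder value, as in the cell above
    })

# Build the DataFrame once, after the loop has finished
results_from_list = pd.DataFrame(rows, columns=["common_name", "size"])

print(results_from_list)
```

Passing `columns` explicitly keeps the column order fixed and still produces the expected columns if `rows` happens to be empty.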

# BONUS: try/except

Sometimes, despite our best efforts, our code will fail to execute. A tag might be missing from a webpage, or data might be entered inconsistently. If an error occurs in the try block, Python will jump to the except block and then continue to execute your program.

In [None]:
results=pd.DataFrame(columns=['common_name','size'])
for idx, row in df.iterrows():
    try:
        common_name=row.common_name
        size=10
    except Exception:
        #fall back to placeholder values if anything in the try block fails
        common_name='no name found'
        size=0
    #build and add the row once, whichever branch ran
    data_row={
        'common_name':common_name,
        'size':size
    }
    data=pd.DataFrame(data_row, index=[0])
    results=pd.concat([results, data], axis=0, ignore_index=True)

results
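
When you know which errors to expect, naming them in the `except` clause keeps unrelated bugs, such as a misspelled variable name, visible instead of silently hiding them. The sketch below uses made-up size strings to stand in for inconsistently entered data; the list `raw_sizes` and the fallback value `0.0` are assumptions for illustration.

```python
# Made-up size strings standing in for inconsistently entered data
raw_sizes = ["10", "12.5", "unknown", None]

sizes = []
for raw in raw_sizes:
    try:
        size = float(raw)
    except (TypeError, ValueError) as error:
        # float("unknown") raises ValueError; float(None) raises TypeError
        print(f"Could not convert {raw!r}: {error}")
        size = 0.0
    sizes.append(size)

print(sizes)
```

The loop still finishes for every item, and the printed messages record which values needed the fallback.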