## Python Data Access

On the Python side, we'll work with the same real-world data on North American bumblebees. We'll use Python in SAS Viya Workbench to read four CSV files into pandas DataFrames, which makes it easier to manipulate, explore, and analyze trends and patterns in pollinator populations.

### Importing an Image in Python

In [7]:
from IPython.display import display, HTML

# Title
display(HTML('''
<h3 style="text-align:center; font-size:16px;">Rusty Patched Bumble Bee</h3>
'''))

# Insert image with styling
display(HTML('''
<img src="https://www.canr.msu.edu/home_gardening/uploads/images/Photo1-Rusty.jpg?language_id=1" 
     alt="Rusty Patched Bumble Bee"
     style="width:660px;height:433px;border:2px solid #ccc;border-radius:10px;">
'''))

Bumble bees are vital pollinators for wildflowers and crops, thriving in cooler temperatures and low light. Their unique "buzz pollination" technique—vibrating flowers to release pollen—benefits plants like tomatoes, peppers, and cranberries.

Unfortunately, bumble bee populations are in sharp decline. Recent research by the Xerces Society and the IUCN Bumble Bee Specialist Group indicates that more than a quarter (28%) of North American species face some degree of extinction risk. While some species have gained conservation support, others, such as the Suckley cuckoo bumble bee and the variable cuckoo bumble bee, remain overlooked.

Learn more about efforts to protect the <a href='https://www.xerces.org/rusty-patched-bumble-bee/'>rusty patched bumble bee</a> and explore this <a href='https://storymaps.arcgis.com/stories/c5e591a19eb24d28af483ede7b174434'>story map</a>.

#### Reading a CSV File into a DataFrame

Pandas is a Python library for manipulating, cleaning, and analyzing structured data such as tables and spreadsheets.

Reading a CSV file into a DataFrame is usually the first step in a pandas analysis. The read_csv function loads the file into a DataFrame, a tabular data structure that supports operations such as filtering, grouping, and aggregating.
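
As a small illustration of read_csv followed by filtering, grouping, and aggregating, here is a minimal sketch. It reads from an in-memory string rather than a file, and the sample rows and the column names (species, region, count) are hypothetical; they are not taken from the bumblebee files.

```python
import io

import pandas as pd

# Hypothetical sample data; column names are illustrative only and are
# not taken from the bumblebee CSV files.
csv_text = """species,region,count
Bombus affinis,Midwest,3
Bombus affinis,Northeast,1
Bombus impatiens,Midwest,12
Bombus impatiens,Northeast,9
"""

# read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

# Filtering: keep only the rows for one region
midwest = sample[sample["region"] == "Midwest"]

# Grouping and aggregating: total count per species
totals = sample.groupby("species")["count"].sum()

print(midwest)
print(totals)
```

The same pattern should carry over to a DataFrame read from disk, once the real column names are known.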

In [2]:
# Import the pandas library for data manipulation and analysis
import pandas as pd

In [None]:
# Read the North American bumblebee CSV file into a DataFrame
df1 = pd.read_csv('/workspaces/myfolder/SASInnovate25/pattern_decline_N_American_Bumblebees.csv', encoding='latin-1')

  df1=pd.read_csv('/workspaces/myfolder/SASPythonDataScientists/pattern_decline_N_American_Bumblebees.csv', encoding='latin-1')


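The DtypeWarning that pandas raises for this read (the flagged statement is echoed above) means that columns 6 and 16 contain mixed data types (for example, numbers and text). You can resolve it by specifying the dtype for those columns or by setting low_memory=False. By default, read_csv parses large files in chunks and infers types chunk by chunk, which is how a single column can end up with mixed types. Setting low_memory=False makes pandas read the whole file before inferring types, at the cost of more memory. Specifying dtype is more precise because it states the intended type explicitly, while low_memory=False is a quick fix that still leaves the column type to inference.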

In [3]:
# Read the North American bumblebee CSV file into a DataFrame, forcing columns 6 and 16 to be read as strings
df1 = pd.read_csv('/workspaces/myfolder/SASInnovate25/pattern_decline_N_American_Bumblebees.csv', dtype={6: str, 16: str}, encoding='latin-1')
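
To see what a dtype override changes, the following sketch compares inferred and forced types on a tiny in-memory CSV. The data and the column name site_code are hypothetical, and the override is keyed by column name here rather than by position.

```python
import io

import pandas as pd

# Hypothetical data: site_code looks numeric but is really a text code
csv_text = "id,site_code\n1,0012\n2,0345\n3,7001\n"

# Without dtype, pandas infers integers and drops the leading zeros
inferred = pd.read_csv(io.StringIO(csv_text))

# With dtype, the column is read as text exactly as written
forced = pd.read_csv(io.StringIO(csv_text), dtype={"site_code": str})

print(inferred["site_code"].tolist())
print(forced["site_code"].tolist())
```

In a large file, chunk-by-chunk inference can produce numbers in some chunks and text in others, which is the situation the warning describes. Once the real file is loaded, df1.columns[[6, 16]] should list the names behind positions 6 and 16, which could then be used as dtype keys.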

In [12]:
# Read the Mexican bumblebee CSV file into a DataFrame
df2 = pd.read_csv('/workspaces/myfolder/SASInnovate25/pattern_decline_Mexican_Bumblebees.csv', encoding='latin-1')

In [5]:
# Read the scientific and common name lookup CSV file into a DataFrame
df3 = pd.read_csv('/workspaces/myfolder/SASInnovate25/Bumblebee_Others_Scientific_Common_Names.csv', encoding='latin-1')

In [6]:
# Read the native vs. non-native bee data into a DataFrame
df4 = pd.read_csv('/workspaces/myfolder/SASInnovate25/native_vs_nonnative_bumblebee_sighting_pollinators_of_farm_data_for_publication.csv', encoding='latin-1')
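
Because the four reads share a folder and an encoding, they could also be driven by one small helper, followed by a quick check of each DataFrame's shape. This is only a sketch: load_csvs and summarize are hypothetical helper names, not part of pandas, and the demonstration uses throwaway files with made-up content so that it runs without the workshop data.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csvs(data_dir, filenames, **read_kwargs):
    """Read each named CSV under data_dir into a dict of DataFrames."""
    data_dir = Path(data_dir)
    return {name: pd.read_csv(data_dir / name, **read_kwargs) for name in filenames}


def summarize(frames):
    """Return one row per DataFrame with its row and column counts."""
    rows = [
        {"file": name, "rows": df.shape[0], "columns": df.shape[1]}
        for name, df in frames.items()
    ]
    return pd.DataFrame(rows)


# Demonstration with temporary files (hypothetical content)
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("x,y\n1,2\n3,4\n", encoding="latin-1")
    Path(tmp, "b.csv").write_text("name\nBombus affinis\n", encoding="latin-1")
    frames = load_csvs(tmp, ["a.csv", "b.csv"], encoding="latin-1")
    summary = summarize(frames)

print(summary)
```

With the real data, the folder and file names from the cells above would be passed in place of the temporary ones. Per-file options such as the dtype override for the North American file would still need separate handling.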