# Data Gathering
![ ](images/gathering.png){width=20%}

Data gathering is crucial for any project. On this tab, I go over the sources of my data and the methods I used to extract it.

I divided this page into five sections. Within each section, I provide datasets that relate to the stated topic. The sections are:

 1. Why Ocean Sustainability is Necessary
 2. How the Ocean Helps Us
 3. Negative Impacts on Ocean Sustainability / Human Impact
 4. Benefits of Investing in Sustainability
 5. What is Currently Being Done to Help


Much of this data came from the OECD Ocean sustainability database. To make things more manageable, I created a separate dataset for each variable I want to analyze. The datasets that came from this database are 2c, 4b, 5b, and 5c.

For each dataset, I summarize what it contains and the questions it could help answer, and I provide a link to the source, a link to the cleaned data (where available), the extraction method, and a picture of the raw data.
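
The per-variable split of the OECD export described above can be sketched as follows. This is a minimal illustration, not the code I actually ran: the column name `Variable`, the helper names, and the sample rows are all hypothetical, and the sample values are made up rather than taken from the OECD data.

```python
import csv
import io
from collections import defaultdict


def split_by_variable(rows, variable_column="Variable"):
    """Group dict rows by the value found in `variable_column` (hypothetical name)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[variable_column]].append(row)
    return dict(groups)


def to_csv_text(rows):
    """Serialize dict rows to CSV text, taking the header from the first row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for a combined export (not real OECD values).
sample_export = """Country,Variable,Year,Value
A,Fishing vessels,2020,10
A,Ocean energy R&D,2020,2.5
B,Fishing vessels,2020,7
"""

rows = list(csv.DictReader(io.StringIO(sample_export)))
per_variable = split_by_variable(rows)
for name, subset in per_variable.items():
    # Each subset could then be written to its own file with to_csv_text(subset).
    print(name, len(subset))
```

Keeping one small file per variable makes each later cleaning step easier to check than working from a single wide export.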


## Why Ocean Sustainability is Necessary
**1a. Plastic Leakage by Region**
[Source](https://stats.oecd.org/viewhtml.aspx?datasetcode=PLASTIC_LEAKAGE_V2_2&lang=en)

- This dataset provides estimates of plastic leakage for 15 global regions.
- Extraction Method: Exporting as CSV
- Link to Cleaned Data: [Here](data/01-modified-data/cleandata1a.csv)


![ ](images/data1.png){width=50%}

## How the Ocean Helps Us
**2a. Beach Attendance**
[Source](https://catalog.data.gov/dataset/swimming-beach-attendance)

- This dataset covers beach attendance in the state of New York.
- Extraction Method: Exporting as CSV
- Link to Cleaned Data: [Here](data/01-modified-data/cleandata2a.csv)


![ ](images/data2.png){width=50%}

**2b. Employment in Ocean Sectors**
[Source](https://opdgig.dos.ny.gov/datasets/e25794dc9cab4fe8bd40d76a18c80f66/explore?showTable=true)

- This data shows 2013 ocean economy employment (number of jobs) for Northeast coastal counties, which comprise 33 counties from Maine to New York.
- Extraction Method: Exporting as CSV


![ ](images/data10.png){width=50%}

**2c. Fishing Vessels**
[Source](https://stats.oecd.org/BrandedView.aspx?oecd_bv_id=env-data-en&doi=4c44ff65-en)

- This dataset gives the total number of fishing vessels in many countries. According to this source, "'Fishery vessels' refers to mobile floating objects of any kind and size, operating in freshwater, brackish water and marine waters which are used for catching, harvesting, searching, transporting, landing, preserving and/or processing fish, shellfish and other aquatic organisms, residues and plants. The term 'fishing vessel' is used instead when the vessel is engaged only in catching operations."
- Extraction Method: Exporting as XLS


![ ](images/data2c.png){width=40%}



## Negative Impacts on Ocean Sustainability / Human Impact
**3a. Plastic Pollution**
[Source](https://ourworldindata.org/plastic-pollution)

- This dataset includes global data on plastic waste generation, pollution, and trade. The resource also includes ready-made data visualizations to help readers understand the dataset.
- Extraction Method: Exporting as CSV

![ ](images/data3.png){width=50%}

**3b. Mismanaged Plastic Waste**
[Source](https://www.kaggle.com/datasets/kkhandekar/mismanaged-plastic-waste-around-the-world/data)

- This dataset provides information on mismanaged waste in many countries. Mismanaged waste is material at high risk of entering the ocean.
- Extraction Method: Exporting as CSV

![ ](images/data7.png){width=50%}


## Benefits of Investing in Sustainability

**4a. Aquaculture Production**
[Source](https://data.oecd.org/fish/aquaculture-production.htm)

- This dataset provides aquaculture production by country from 2000 to 2021.
- Extraction Method: Exporting as CSV

![ ](images/data9.png){width=50%}

**4b. Marine Landings**
[Source](https://data.world/agriculture/aquaculture-data)

- This dataset gives how much money marine landings bring in (in millions) for 40 different countries.
- Extraction Method: Exporting as XLS

![ ](images/data4b.png){width=40%}

## What is Currently Being Done to Help

**5b. Marine Pollution Act Numbers**
[Source](https://stats.oecd.org/viewhtml.aspx?datasetcode=OCEAN&lang=en)

- This dataset gives the total MPA numbers for each country.
- Extraction Method: Exporting as XLS

![ ](images/data6.png){width=40%}

**5c. Money Going into Ocean Energy Research**
[Source](https://stats.oecd.org/BrandedView.aspx?oecd_bv_id=env-data-en&doi=4c44ff65-en)

- This dataset gives the amount of money (in millions) going into research and development of renewable energy.
- Extraction Method: Exporting as XLS

![ ](images/5c.png){width=40%}

## Python API

I used the News API that we were introduced to in class to pull news articles about ocean sustainability. I saved the titles and descriptions to text files, which I will use for text analysis.

Link to Code: [Here](codes/01-data-gathering/Python%20API/apinews_gathering.ipynb)

This code also cleans the text data.
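
As a rough illustration of the steps above (query the API, keep titles and descriptions, clean the text), here is a minimal sketch. It is not a copy of the linked notebook. It assumes the service is NewsAPI.org's `everything` endpoint, and the helper names (`build_url`, `fetch_articles`, `clean_text`, `titles_and_descriptions`) and the cleaning rules are hypothetical choices made for this example.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: the class API is NewsAPI.org; the notebook may use a different client.
NEWSAPI_ENDPOINT = "https://newsapi.org/v2/everything"


def build_url(query, api_key, page_size=50):
    """Build the request URL for a keyword search."""
    params = {"q": query, "language": "en", "pageSize": page_size, "apiKey": api_key}
    return NEWSAPI_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_articles(query, api_key):
    """Call the API (needs a valid key and network access) and return the article list."""
    with urllib.request.urlopen(build_url(query, api_key)) as response:
        return json.load(response).get("articles", [])


def clean_text(text):
    """Lowercase, replace anything that is not a letter or space, collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", (text or "").lower())
    return re.sub(r"\s+", " ", text).strip()


def titles_and_descriptions(articles):
    """Return (clean title, clean description) pairs; missing fields become ''."""
    return [(clean_text(a.get("title")), clean_text(a.get("description"))) for a in articles]
```

Calling `titles_and_descriptions(fetch_articles("ocean sustainability", api_key))` with a real key would give the pairs that can then be written to text files, one line per article.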

## R API

I used the API provided for the [marine oil spills dataset](https://www.data.qld.gov.au/dataset/marine-oil-spills-data/resource/6d5865f0-b7fc-4770-a303-a0b1f85f661f) on the Queensland Government open data portal, and I called this API from R.

Link to Code: [R Code](codes/01-data-gathering/R_API_DATA.Rmd)
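
For readers who do not use R, the same kind of request can be sketched in Python. This is not a translation of the linked R code. It assumes the portal exposes the standard CKAN `datastore_search` action and that this resource is loaded into the datastore. The resource ID is taken from the dataset link above, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: standard CKAN action endpoint; not taken from the original R code.
DATASTORE_SEARCH = "https://www.data.qld.gov.au/api/3/action/datastore_search"
RESOURCE_ID = "6d5865f0-b7fc-4770-a303-a0b1f85f661f"  # from the dataset link


def build_query_url(resource_id=RESOURCE_ID, limit=100, offset=0):
    """Build a paged datastore_search URL for one resource."""
    params = {"resource_id": resource_id, "limit": limit, "offset": offset}
    return DATASTORE_SEARCH + "?" + urllib.parse.urlencode(params)


def records_from_payload(payload):
    """Pull the row dictionaries out of a CKAN response, failing loudly on errors."""
    if not payload.get("success"):
        raise ValueError("CKAN request did not succeed")
    return payload["result"]["records"]


def fetch_records(limit=100, offset=0):
    """Call the API (needs network access) and return the rows as dictionaries."""
    with urllib.request.urlopen(build_query_url(limit=limit, offset=offset)) as response:
        return records_from_payload(json.load(response))
```

Increasing `offset` by `limit` on each call pages through the full table.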


