# 🔎 Finding Datasets with PySyft 0.9.1b
---

## 🖼 Scenario

You are conducting a project to understand the relationship between the percentage of jobs in the private sector and the percentage of people living below the poverty level among U.S. citizens in 2015. Additionally, you want to investigate how gender is distributed among those above the poverty line and those below it. You have been given access to a statistics Datasite called US Stats to find datasets that can aid your research.

## 😎 Mission

Your mission for this exercise is to use Syft’s Python API to explore the US Stats Datasite and find the dataset(s) that best fit your study. While you search, keep an eye out for …
- Relevance
- Quality
- Trustworthiness
- Representation
- Datasets that are available for academic research

Please add `#` comments explaining your process as you go. After you have found the dataset(s) that you believe fit your study, list them in the __“Mission Response”__ section at the bottom of this notebook. When you are done, complete the __“Post-Test”__ section at the bottom of this notebook and then notify your moderator.

_Disclaimer: The datasets provided have been modified for the purposes of this test and are not an accurate reflection of the source data collected by the U.S. Census Bureau or any other contributors mentioned therein._

### Helpful Resources
- [PySyft Documentation](https://docs.openmined.org/en/latest/index.html)
- [PySyft Repo](https://github.com/OpenMined/PySyft)

---


# 🖥 Setup Test Environment

### Run in Colab (optional)
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/OpenMined/design/blob/main/user_tests/finding_data_091b/Finding%20Datasets%20with%20PySyft091b.ipynb) 

### 1. Install Syft

Before you begin the test, you will need to have PySyft installed. If you do not have it installed, you can run the cell below or follow this [Quick Install Guide](https://docs.openmined.org/en/latest/quick_install.html) to get started.

In [1]:
# The command below installs the PySyft beta on your machine (uncomment to run)
#!pip install -U syft --pre

### 2. Deploy Test Environment
Please run the cells below to deploy a local instance of the test environment.

In [None]:
# IMPORT DEPENDENCIES
import syft as sy
import pandas as pd

# DOWNLOAD "finding_data_setup" TEST SCRIPT
!curl -O https://raw.githubusercontent.com/OpenMined/design/main/user_tests/finding_data_091b/finding_data_setup.py

# IMPORT TEST SCRIPT
import finding_data_setup as fd_setup

# LAUNCH LOCAL SERVER
server = sy.orchestra.launch(
    name="U.S. Stats",
    reset=True,
    port=8080,
)

# CREATE ADMIN PROFILE & DATASITE PROFILE
fd_setup.setup_datasite(port=8080)

# UPLOAD DATASETS TO DATASITE
# This may take a moment while the data loads.
fd_setup.upload_datasets(port=8080)

---

# 🚩 Begin Mission!

### 1. Begin Searching

In [None]:
# Log in to the Datasite as a guest user
guest = sy.login_as_guest(port=8080)

In [None]:
# Begin mission to search for datasets
guest
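
One way to get an overview before opening datasets one at a time is to collect their metadata into a list. The sketch below is not part of the test script. It assumes that the guest client exposes `datasets.get_all()` and that each dataset carries `name`, `description`, `citation`, and `url` attributes; check these against the PySyft documentation for your version. The helper name `summarize_datasets` and the stand-in client are hypothetical and exist only so the cell runs without a server.

```python
from types import SimpleNamespace


def summarize_datasets(client):
    """Return one metadata dict per dataset visible to `client`.

    Assumption: `client.datasets.get_all()` returns dataset objects.
    Attributes that are absent come back as None instead of raising.
    """
    rows = []
    for ds in client.datasets.get_all():
        rows.append(
            {
                "name": getattr(ds, "name", None),
                "description": getattr(ds, "description", None),
                "citation": getattr(ds, "citation", None),
                "url": getattr(ds, "url", None),
            }
        )
    return rows


# Hypothetical stand-in for the guest client, used only for illustration.
fake_client = SimpleNamespace(
    datasets=SimpleNamespace(
        get_all=lambda: [
            SimpleNamespace(
                name="Hypothetical Dataset A",
                description="Example entry, not a real dataset",
                citation=None,
                url=None,
            )
        ]
    )
)

summary = summarize_datasets(fake_client)
for row in summary:
    print(row["name"], "|", row["citation"])
```

With the test environment running, the same helper could be called as `summarize_datasets(guest)`. A missing citation or URL is a useful prompt to look more closely at trustworthiness.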

In [1]:
# Add as many cells as you need ^_^
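
The relevance and quality criteria from the Mission section can be checked roughly with pandas once you have a table to inspect. The sketch below assumes you have obtained a `pandas.DataFrame` preview of a dataset by some means. The helper `screen_table`, the sample table, and every column name in it are hypothetical and are not taken from the US Stats Datasite.

```python
import pandas as pd


def screen_table(df, required_columns):
    """Report row count, absent columns, and share of missing values."""
    missing_columns = [c for c in required_columns if c not in df.columns]
    present = [c for c in required_columns if c in df.columns]
    # Fraction of null entries per required column that is present.
    null_share = df[present].isna().mean().to_dict()
    return {
        "rows": len(df),
        "missing_columns": missing_columns,
        "null_share": null_share,
    }


# Hypothetical preview table, for illustration only.
mock = pd.DataFrame(
    {
        "state": ["A", "B", "C", "D"],
        "private_work_pct": [70.0, None, 65.5, 80.1],
        "poverty_pct": [12.0, 15.5, 9.8, 20.3],
    }
)

report = screen_table(
    mock, ["state", "private_work_pct", "poverty_pct", "women_pct"]
)
print(report)
```

A required column that is absent points to a relevance gap, while a high share of missing values points to a quality concern. Neither check says anything about trustworthiness or licensing, which still need to be read from the dataset's own description.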

---
## ✏ Post-Test Response
### 1. Mission Response
In the cell below, please list the dataset(s) you think would best suit the research question posed in the **Scenario** above.

In [None]:
# List datasets here


### 2. Post-Test Survey

Please **upload your notebook** and tell us about your experience in the [**→→ form here ←←**](https://forms.gle/9HqucvjuHaKEuz777) to conclude the test.


### 🛑 Shut Down Test Environment
Run the cell below to shut down the local instance of the test environment.

In [None]:
# The following command shuts down the local test server
server.land()