# 📉 Answering Questions on Data You Cannot See with PySyft 0.9.1b
---

## 🖼 Scenario

You have been given access to the U.S. Stats Datasite as an external researcher. On the Datasite you have found a dataset containing information about famous personalities, including occupation, birth year, death year, country, and the associated country's life expectancy. You want to learn whether any trends exist between age at death and the occupation each personality held.

## 😎 Mission

Your mission for this exercise is to take on the role of an external researcher and use PySyft's API to write and submit code requests that answer the following questions:

- What are the top 10 occupations with the shortest lifespans?
- What is the average lifespan of the occupation with the shortest lifespan?
- How does the average lifespan of a "Politician" compare to that of the average "Associated Country Life Expectancy"?
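
One way to prototype the three analyses before wrapping them for submission is sketched below in plain Python, so it runs without PySyft or a Datasite. The rows and field names (`occupation`, `birth_year`, `death_year`, `country_life_expectancy`) are hypothetical stand-ins, so inspect the mock data for the real column names. The sketch also assumes that lifespan means death year minus birth year, and that the third question compares politicians' lifespans with the life expectancy recorded on those same rows.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for the mock data; real column names may differ.
MOCK_ROWS = [
    {"occupation": "Politician", "birth_year": 1900, "death_year": 1980, "country_life_expectancy": 75.0},
    {"occupation": "Politician", "birth_year": 1920, "death_year": 1990, "country_life_expectancy": 77.0},
    {"occupation": "Athlete", "birth_year": 1950, "death_year": 2000, "country_life_expectancy": 78.0},
    {"occupation": "Athlete", "birth_year": 1940, "death_year": 2000, "country_life_expectancy": 76.0},
    {"occupation": "Artist", "birth_year": 1880, "death_year": 1945, "country_life_expectancy": 70.0},
]


def average_lifespan_by_occupation(rows):
    """Return {occupation: mean(death_year - birth_year)}."""
    lifespans = defaultdict(list)
    for row in rows:
        lifespans[row["occupation"]].append(row["death_year"] - row["birth_year"])
    return {occupation: mean(values) for occupation, values in lifespans.items()}


def shortest_lived_occupations(rows, n=10):
    """Return up to n (occupation, average lifespan) pairs, shortest first."""
    averages = average_lifespan_by_occupation(rows)
    return sorted(averages.items(), key=lambda item: item[1])[:n]


def lifespan_vs_life_expectancy(rows, occupation="Politician"):
    """Return (average lifespan, average country life expectancy) for one occupation."""
    selected = [row for row in rows if row["occupation"] == occupation]
    average_lifespan = mean(row["death_year"] - row["birth_year"] for row in selected)
    average_expectancy = mean(row["country_life_expectancy"] for row in selected)
    return average_lifespan, average_expectancy


ranking = shortest_lived_occupations(MOCK_ROWS)
print(ranking)        # questions 1 and 2: the first entry is the shortest-lived occupation
print(lifespan_vs_life_expectancy(MOCK_ROWS))  # question 3
```

With a pandas DataFrame, the same grouping is usually expressed with `groupby` followed by `mean` and a sort.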

This test consists of two parts.

**Part One**
- You will need to use the mock data provided on the Datasite to form your code
- You will need to submit your code for all 3 questions for review
- After you submit your code, notify your moderator, who will be playing the role of Data Owner

**Part Two**
- Your moderator will review your code, run it on the real data, and then notify you when your code results are ready
- You will need to review the final results and provide your answers to the questions above in the **Mission Response** section

_*Disclaimer: The dataset provided has been modified for the purposes of this test and is not an accurate reflection of the [source data](https://workshop-proceedings.icwsm.org/abstract?id=2022_82) collected in the ICWSM workshop._

### Helpful Resources
- [PySyft Documentation](https://docs.openmined.org/en/latest/index.html)
- [PySyft Repo](https://github.com/OpenMined/PySyft)

#####
---


# 🖥 Setup Test Environment

### Run in Colab (optional)
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/OpenMined/design/blob/main/user_tests/remote_data_science_091b/Remote%20Data%20Science%20with%20PySyft091b.ipynb) 

### 1. Install Syft

Before you begin the test, you will need to have PySyft installed. If it is not installed yet, uncomment and run the cell below, or follow this [Quick Install Guide](https://docs.openmined.org/en/latest/quick_install.html) to get started.

In [1]:
# Uncomment the line below to install PySyft (beta) on your machine
#!pip install -U syft --pre

### 2. Deploy Test Env
Please run the cells below to deploy a local instance of the test environment.

In [1]:
# IMPORT DEPENDENCIES
import syft as sy
import pandas as pd

In [None]:
# DOWNLOAD TEST SCRIPT
!curl -O https://raw.githubusercontent.com/OpenMined/design/main/user_tests/remote_data_science_091b/remote_exec_setup.py

In [2]:
# IMPORT TEST SCRIPT
import remote_exec_setup as re_setup

Test Environment functions have been created!


In [3]:
# LAUNCH LOCAL SERVER
server = sy.orchestra.launch(
    name="U.S. Stats",
    reset=True,
    port=8080,
)

Starting U.S. Stats server on 0.0.0.0:8080
Waiting for server to start...................

In [4]:
# CREATE ADMIN PROFILE & DATASITE PROFILE
re_setup.setup_datasite(port=8080)

Logged into <U.S. Stats: High side Datasite> as <info@openmined.org>


Admin Moderator Chan has been created! (1/2)
Datasite profile has been updated! (2/2)


In [5]:
# UPLOAD DATASETS TO DATASITE
# This may take a second as it loads the data.
re_setup.upload_datasets(port=8080)

Loading data from 'github.com/OpenMined/design/user_tests/remote_data_science_091b/assets'...
Data has been locally loaded! (1/4)
Defining dataset metadata...
Dataset metadata has been defined! (2/4)
Logged into <U.S. Stats: High side Datasite> as <info@openmined.org>


Uploading: Subsample for study: 100%|██████████| 1/1 [00:06<00:00,  6.72s/it]


Test datasets have been uploaded to US Stats datasite! (3/4)
Creating Data Scientist account...
Data Scientist account created! (4/4)
You may now begin the test!


---

# 🚩 Begin Mission!

### Part One
To answer the questions listed in the Mission section, write your code against the mock data and then submit it for review. If you get stuck, you can use [PySyft's Documentation Site](https://docs.openmined.org/en/latest/getting-started/part3-research-study.html) for assistance.
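
The sketch below shows one possible shape of a submission, not a reference solution. It assumes the dataset of interest is the first one listed on the Datasite and has a single asset, and that the columns are named `occupation`, `birth_year`, and `death_year`; all of these are hypothetical, so check the mock data first. It relies on `sy.syft_function_single_use` and `ds_client.code.request_code_execution` as we understand them from the PySyft 0.9 documentation; verify both against the docs linked above before relying on them.

```python
def average_lifespan_by_occupation(df):
    """Analysis to run remotely; df is expected to be a pandas DataFrame."""
    # Hypothetical column names; inspect the asset's mock data for the real ones.
    lifespan = df["death_year"] - df["birth_year"]
    return lifespan.groupby(df["occupation"]).mean().sort_values()


def submit_lifespan_request(ds_client):
    """Wrap the analysis as a single-use Syft function and request its execution."""
    # Imported lazily so this sketch can be defined before PySyft is installed.
    import syft as sy

    # Assumption: first dataset, first asset. Adjust after browsing ds_client.datasets.
    asset = ds_client.datasets[0].assets[0]

    # Bind the asset as the function's only permitted input.
    remote_fn = sy.syft_function_single_use(df=asset)(average_lifespan_by_occupation)

    # Send the function to the Data Owner for review.
    return ds_client.code.request_code_execution(remote_fn)
```

Before submitting, it is worth calling `average_lifespan_by_occupation` on the asset's mock data to confirm that it runs and returns the shape of result you expect.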

In [None]:
# Login to the Datasite as a "Data Scientist" user
ds_client = sy.login(email="ds_tester@openmined.org", password="probetatester", port=8080)

In [None]:
# Begin mission to investigate occupation and lifespan


In [None]:
# Add as many cells as you need ^_^

### End Part One 🙌
Great job! Now that you have submitted your code for review, please run the cell below before proceeding. If you run into any issues, please notify your moderator.

In [None]:
# Run Data Owner Review Script
re_setup.data_owner_response(port=8080)


---
### Part Two
The Data Owner has reviewed your code and uploaded the corresponding results. You can now review the results and answer the Mission questions in the **Mission Response** section. If you get stuck, you can use [PySyft's Documentation Site](https://docs.openmined.org/en/latest/getting-started/part3-research-study.html) for assistance.

In [None]:
# Login to the Datasite as a Data Scientist
ds_client = sy.login(email="ds_tester@openmined.org", password="probetatester", port=8080)

In [None]:
# Add as many cells as you need ^_^

### End Part Two 🙌
Great job! Now that you have received the results of your code, please complete the **Mission Response** section below to finish the test.

#####
---
## ✏ Post-Test Response
### 1. Mission Response

__What were the top 10 occupations with the shortest lifespans?__
- _Answer here_

__What was the average lifespan of the occupation with the shortest lifespan?__
- _Answer here_

__Is the average lifespan of a "Politician" higher or lower than that of the average "Associated Country Life Expectancy"?__
- _Answer here_

### 2. Post-Test Survey

Please **upload your notebook** and tell us about your experience in the [**→→ form here ←←**](https://forms.gle/jjtYQqttPTR55fkr8) to conclude the test.


#####
### 🛑 Shut Down Test Environment
Run the cell below to shut down the local instance of the test environment.

In [7]:
# The following command will shut down the local test server
server.land()

Stopping U.S. Stats
