# Syft Overview

<b> PySyft (v0.8.2): Data Scientist Documentation Notebook 1 </b>

<img src="../_images/title_syft_light.png"></img>


Welcome!
PySyft is an <a href="https://github.com/openmined/pysyft">open-source library</a> that lets you perform data science on data located on someone else's server.

Whether you're a data scientist, machine learning engineer, or looking to audit a large language model (LLM) or its training data, PySyft empowers you to perform your analyses on sensitive data without compromising privacy, security or intellectual property. 

Several public and private organizations have successfully deployed PySyft to let external researchers analyse and learn from private datasets and models. The most recent precedent for advancing algorithmic transparency with privacy-enhancing technologies (PETs) is the <a href="https://www.christchurchcall.com/media-and-resources/news-and-updates/christchurch-call-initiative-on-algorithmic-outcomes/">Christchurch Call Initiative on Algorithmic Outcomes (CCIAO)</a>, where multiple external researchers conducted projects on internal production algorithms at Microsoft and DailyMotion using PySyft. A similar project was run in collaboration with <a href="https://blog.openmined.org/announcing-our-partnership-with-twitter-to-advance-algorithmic-transparency/">Twitter</a>. Additionally, through the <a href="https://www.economist.com/science-and-technology/the-un-is-testing-technology-that-processes-data-confidentially/21807385">UN PET Lab</a>, multiple national statistics offices around the world are piloting PySyft to enable secure external access to internal data, as well as regional and international collaboration on matters that require joint data access and analysis.

This notebook is the first of many, and will cover the following:
- Motivating Problems
- PySyft
    - Mailbox for Code
    - Tailored API proposal platform
- Levels
- Installation
- Deployment
    
    
The next notebook picks up from there and teaches you how to inspect datasets on the domain node and create code requests.

<hr>

In [None]:
from matplotlib import rcParams, cycler
import matplotlib.pyplot as plt
import numpy as np
plt.ion()

In [None]:
# Fixing random state for reproducibility
np.random.seed(19680801)

N = 10
data = [np.logspace(0, 1, 100) + np.random.randn(100) + ii for ii in range(N)]
data = np.array(data).T
cmap = plt.cm.coolwarm
rcParams['axes.prop_cycle'] = cycler(color=cmap(np.linspace(0, 1, N)))


from matplotlib.lines import Line2D
custom_lines = [Line2D([0], [0], color=cmap(0.), lw=4),
                Line2D([0], [0], color=cmap(.5), lw=4),
                Line2D([0], [0], color=cmap(1.), lw=4)]

fig, ax = plt.subplots(figsize=(10, 5))
lines = ax.plot(data)
ax.legend(custom_lines, ['Cold', 'Medium', 'Hot']);

You can do a lot more with outputs in your book, such as including interactive outputs. For more information, see [the Jupyter Book documentation](https://jupyterbook.org).

# Helper Function

In [None]:
# Add the local module directory to the import path so that
# helper_functions.py can be imported from this notebook
import sys
sys.path.append('../module/')

from helper_functions import add_numbers, calculate_average

numbers = [10, 20, 30]
average = calculate_average(numbers)
print(f"The average is {average}.")

# Example usage
a = 5
b = 7
result = add_numbers(a, b)

print(f"The sum of {a} and {b} is {result}.")
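
The `helper_functions` module itself is not shown in this notebook. As a rough guide to what the cell above expects, here is a minimal sketch of what `../module/helper_functions.py` might contain. The function bodies, type hints, and the empty-input check are assumptions made for illustration; only the names `add_numbers` and `calculate_average` come from the import above.

```python
# Hypothetical contents of ../module/helper_functions.py.
# This is an illustrative sketch, not the actual module.
from typing import Sequence


def add_numbers(a: float, b: float) -> float:
    """Return the sum of two numbers."""
    return a + b


def calculate_average(numbers: Sequence[float]) -> float:
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not numbers:
        # Assumed behaviour: fail loudly rather than divide by zero.
        raise ValueError("numbers must not be empty")
    return sum(numbers) / len(numbers)
```

With definitions like these, the cell above would report an average of `20.0` for `[10, 20, 30]` and a sum of `12` for `5` and `7`.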
