# Welcome

Please install a few Python dependencies via pip before you begin.

```
pip3 install matplotlib pandas pysnyk
```

### What do we want to achieve in the workshop

- Learn roughly how `pysnyk` works
- Navigate the basic `pysnyk` API
- Do management tasks:
  - find projects
  - change some fields for these projects
- Do reporting tasks:
  - recreate one chart from the dashboard
  - create a donut chart from something
- Encourage the SE team to work on these sheets to cover more use cases.

In [1]:
import snyk
import pandas as pd

Let's get connected. We start by creating a `SnykClient` instance, which is the Python interface to the Snyk API.

The constructor takes an API token as a string. Create a service account in your Snyk backend and paste its token here. The constructor also accepts other parameters, such as the URL (useful for testing against the EU instance). See https://pypi.org/project/pysnyk/ for details.

**Note this project by default uses the V1 API**. It is possible to point it to V3 and get a raw HTTP Client, but there will be easier ways for that in the future using a generated OpenAPI client.

In [2]:
client = snyk.SnykClient("5d778343-78e7-4c73-9ba6-55af47a2f6af")
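A token pasted into a notebook is easy to leak when the notebook is shared. As a minimal sketch of one alternative, the token could be read from an environment variable instead. The helper `load_token` and the variable name `SNYK_TOKEN` are assumptions for illustration and are not part of `pysnyk`.

```python
import os


def load_token(env=os.environ, name="SNYK_TOKEN"):
    """Return the API token stored in the given environment mapping.

    `name` is an assumed variable name; use whatever your setup exports.
    """
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return token


# Usage in the notebook (requires pysnyk and a valid token):
# client = snyk.SnykClient(load_token())
```

Passing the environment as a parameter keeps the helper easy to try out with a plain dictionary.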

We fetch all organisations and print them in a table. _Ignore the `_client` column_.

In [3]:
orgs = client.organizations.all()

pd.DataFrame(orgs)

|  | name | id | slug | url | group | _client |
|---|---|---|---|---|---|---|
| 0 | GL Test | 4d0ac6a1-9e83-40de-b6ec-df7a723fe532 | gl-test | https://app.snyk.io/org/gl-test | `{'name': 'E Corp Group', 'id': 'a6214dc6-8e1c-...` | `<snyk.client.SnykClient object at 0x1340a6e50>` |
| 1 | .Net World | c6d37704-5daa-41ad-9ca8-33a5b55b4872 | .net-world | https://app.snyk.io/org/.net-world | `{'name': 'E Corp Group', 'id': 'a6214dc6-8e1c-...` | `<snyk.client.SnykClient object at 0x1060e7130>` |
| 2 | E Corp | 47d53b9a-81ad-49c0-8cb5-73f814bbc1fd | e-corp | https://app.snyk.io/org/e-corp | `{'name': 'E Corp Group', 'id': 'a6214dc6-8e1c-...` | `<snyk.client.SnykClient object at 0x1060e7100>` |
| 3 | WAD 2022 | 0df00f3d-feb3-41ce-88ce-a1a44d8b7f8b | wad-2022 | https://app.snyk.io/org/wad-2022 | `{'name': 'E Corp Group', 'id': 'a6214dc6-8e1c-...` | `<snyk.client.SnykClient object at 0x1340e1040>` |


Let's take a quick excursion into `pandas`. The `DataFrame` object has attributes and methods that describe it, such as `.columns`, which lists the column names.

In [5]:
pd.DataFrame(orgs).columns

Index(['name', 'id', 'slug', 'url', 'group', '_client'], dtype='object')

With that knowledge, let's extract only two columns to make the table easier to read.

In [6]:
pd.DataFrame(orgs)[['id', 'name']]

|  | id | name |
|---|---|---|
| 0 | 4d0ac6a1-9e83-40de-b6ec-df7a723fe532 | GL Test |
| 1 | c6d37704-5daa-41ad-9ca8-33a5b55b4872 | .Net World |
| 2 | 47d53b9a-81ad-49c0-8cb5-73f814bbc1fd | E Corp |
| 3 | 0df00f3d-feb3-41ce-88ce-a1a44d8b7f8b | WAD 2022 |


## Finding an organisation

Next up, let's look at a particular organisation by its slug. Note the `[0]` at the end: the condition could match multiple orgs, but we only take the first one. You can extend the condition to return several.

In [7]:
org = [org for org in orgs if org.slug == 'e-corp'][0]

In [8]:
print("Found org: \nID: {}\nName: {}\n".format(org.id, org.name))

Found org: 
ID: 47d53b9a-81ad-49c0-8cb5-73f814bbc1fd
Name: E Corp
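Indexing the filtered list with `[0]` raises an `IndexError` when no slug matches. As a small sketch, `next()` with a default returns `None` instead. The helper name `find_org_by_slug` is hypothetical, and the sample objects are stand-ins that only mimic the `name` and `slug` attributes of the real orgs.

```python
from types import SimpleNamespace


def find_org_by_slug(orgs, slug):
    """Return the first org whose slug matches, or None if there is none."""
    return next((org for org in orgs if org.slug == slug), None)


# Stand-in objects with the same attributes as the orgs listed above.
sample_orgs = [
    SimpleNamespace(name="GL Test", slug="gl-test"),
    SimpleNamespace(name="E Corp", slug="e-corp"),
]

match = find_org_by_slug(sample_orgs, "e-corp")
missing = find_org_by_slug(sample_orgs, "does-not-exist")

print(match.name if match else "no such org")
print(missing)
```

In the notebook, the same helper could be called with the `orgs` list fetched earlier.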



### Analysing the projects within an organisation

Now that we have the organisation, let's look at the projects within.

We saw earlier how to extract only certain columns; we will do the same here.

Start by fetching all projects and store them in the `projects` variable.

In [24]:
# Fetch all projects.
projects = org.projects.all()

We can run queries on `projects` as well. For example, let's first look at the types of projects we scan.

Again, we can use the `DataFrame` to drill into a column, and then call the `unique` method to find the types we have.

In [28]:
pd.DataFrame(projects)['type'].unique()

array(['maven', 'nuget', 'deb', 'dockerfile', 'npm', 'cocoapods',
       'gradle', 'yarn', 'rpm', 'pip', 'cpp', 'gomodules', 'k8sconfig',
       'sast', 'terraformconfig'], dtype=object)

Let's fetch some details about the projects themselves.

In [25]:
# Maybe we are interested in a particular type only?
# projects = [p for p in projects if p.type == 'maven']

# Show the data frame.
pd.DataFrame(projects)[['name', 'type', 'totalDependencies']]

|  | name | type | totalDependencies |
|---|---|---|---|
| 0 | sebsnyk/java-reachability-playground(master):p... | maven | 14.0 |
| 1 | sebsnyk/MvnSpringBootTest(main):pom.xml | maven | 18.0 |
| 2 | sebsnyk/WebGoat(develop):webgoat-lessons/passw... | maven | 116.0 |
| 3 | sebsnyk/WebGoat(develop):webwolf/pom.xml | maven | 106.0 |
| 4 | sebsnyk/juliet-test-suite-csharp(main):src/tes... | nuget | 0.0 |
| ... | ... | ... | ... |
| 308 | sebsnyk/terraform-goof(master):modules/subnet/... | terraformconfig |  |
| 309 | sebsnyk/terraform-goof(master):modules/storage... | terraformconfig |  |
| 310 | sebsnyk/terraform-goof(master):modules/vpc/out... | terraformconfig |  |
| 311 | sebsnyk.juice-shop:k8s-src/juice-shop-deploy.yaml | k8sconfig |  |


### Quick Pandas Math

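Pandas supports various methods like `.mean()` or `.max()` on a column. We can also use `.describe()` to get summary statistics for a particular column like `totalDependencies`. Note that `count` only includes rows that have a value, so it can be lower than the number of projects.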

In [15]:
frame = pd.DataFrame(projects)
frame['totalDependencies'].describe()

count     258.000000
mean      100.217054
std       237.032955
min         0.000000
25%         0.000000
50%         0.000000
75%       116.000000
max      1243.000000
Name: totalDependencies, dtype: float64
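The `count` of 258 is lower than the 312 rows in the project table because `describe()` skips missing values. A minimal sketch on stand-in data (the numbers below are illustrative and not taken from the Snyk API) shows the difference between the row count and the non-missing count.

```python
import pandas as pd

# Stand-in data: two projects report a dependency count, two do not.
deps = pd.Series(
    [14.0, 116.0, float("nan"), float("nan")],
    name="totalDependencies",
)

summary = deps.describe()

print(len(deps))          # number of rows, including missing values
print(summary["count"])   # number of non-missing values only
print(deps.isna().sum())  # number of rows without a dependency count
```

The same `isna().sum()` call on `frame['totalDependencies']` would show how many projects have no dependency count.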

## Use case: How many projects per type?

It can be interesting to know how many projects exist per type. In pandas, this can be done with `groupby`. The interesting bit is how pandas automatically sums up the relevant fields per group: `totalDependencies` is added up, while the boolean `isMonitored` is summed as the number of `True` values, which gives the count of monitored projects.

In [32]:
pd.DataFrame(projects).groupby('type').sum()

| type | readOnly | totalDependencies | isMonitored |
|---|---|---|---|
| cocoapods | 0 | 72.0 | 3 |
| cpp | 0 | 15.0 | 1 |
| deb | 0 | 1269.0 | 3 |
| dockerfile | 0 | 3587.0 | 28 |
| gomodules | 0 | 31.0 | 1 |
| gradle | 0 | 29.0 | 4 |
| k8sconfig | 0 | 0.0 | 15 |
| maven | 0 | 4691.0 | 62 |
| npm | 0 | 15017.0 | 18 |
| nuget | 0 | 352.0 | 133 |
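Note that `sum()` adds up column values per group, so it does not directly answer how many projects each type has. A hedged sketch on stand-in rows (the values are made up; only the column names follow the project table) uses `size()` for the row count and selects the columns to sum explicitly, since newer pandas versions may not silently skip non-numeric columns.

```python
import pandas as pd

# Stand-in rows with the same column names as the project table above.
frame = pd.DataFrame(
    {
        "type": ["maven", "maven", "npm", "k8sconfig"],
        "totalDependencies": [14.0, 18.0, 120.0, None],
        "isMonitored": [True, False, True, True],
    }
)

# Number of projects per type, regardless of the other columns.
per_type = frame.groupby("type").size()

# Sums restricted to the chosen columns; each True counts as 1.
totals = frame.groupby("type")[["totalDependencies", "isMonitored"]].sum()

print(per_type)
print(totals)
```

For a quick count without grouping, `frame['type'].value_counts()` gives the same numbers sorted by frequency.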


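## Managing project tags

As a management task, we can also change fields on a project. Here we add two tags with the key `hello` to the first project, fetch the projects again, and read the tags back.

In [12]: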
projects[0].tags.add('hello', 'world')
projects[0].tags.add('hello', 'world2')

True

In [13]:
projects = org.projects.all()
projects[0].tags.all()

[{'key': 'hello', 'value': 'world2'}, {'key': 'hello', 'value': 'world'}]

In [14]:
x = [tag for tag in projects[0].tags.all() if tag['key'] == 'hello']
x

[{'key': 'hello', 'value': 'world2'}, {'key': 'hello', 'value': 'world'}]

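Finally, we try to look up a tag with `tags.get` and delete it. In this run the cell fails with an `AttributeError`, shown below.

In [15]:
tag = projects[0].tags.get('hello')
if tag is not None: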
    projects[0].tags.delete('hello', 'world')
projects[0].tags.all()

AttributeError: 'dict' object has no attribute 'id'
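The error suggests that the lookup expects objects with an `id` attribute, while tags come back as plain dictionaries, as the output of `tags.all()` shows. One possible workaround, sketched here on a copy of that output, is to filter the dictionaries by key. The helper `find_tags` is hypothetical and not part of `pysnyk`.

```python
def find_tags(tags, key):
    """Return every tag dict whose 'key' entry equals the given key."""
    return [tag for tag in tags if tag.get("key") == key]


# Same shape as the list returned by projects[0].tags.all() above.
sample_tags = [
    {"key": "hello", "value": "world2"},
    {"key": "hello", "value": "world"},
]

hello_tags = find_tags(sample_tags, "hello")
has_world = any(tag["value"] == "world" for tag in hello_tags)

print(len(hello_tags))
print(has_world)

# In the notebook, the check could then guard the delete call:
# if has_world:
#     projects[0].tags.delete('hello', 'world')
```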