# Welcome

Please install a few Python dependencies via pip before you begin.

```
pip3 install matplotlib pandas pysnyk
```

### What do we want to achieve in the workshop

- Learn roughly how `pysnyk` works
- Navigate the basic `pysnyk` API
- Do management tasks:
  - find projects
  - change some fields for these projects
- Do reporting tasks:
  - recreate one chart from the dashboard
  - create a donut chart from something
- Encourage the SE team to work on these sheets to cover more use cases.

In [None]:
import snyk
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

plt.rcParams['figure.dpi'] = 300
plt.rcParams['savefig.dpi'] = 300

%config InlineBackend.figure_format = 'retina'
%config InlineBackend.dpi = 300

Let's get connected. We start by creating a `SnykClient` instance, which is the Python interface to the Snyk API.

Pass it an API token as the string argument, for example the token of a service account that you create in the Snyk backend. The constructor also takes other parameters, such as the URL (useful for testing against the EU instance). See details at https://pypi.org/project/pysnyk/.

**Note that this project uses the V1 API by default**. It is possible to point it to V3 and get a raw HTTP client, but there will be easier ways to do that in the future using a generated OpenAPI client.

In [None]:
client = snyk.SnykClient("5d778343-78e7-4c73-9ba6-55af47a2f6af")
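Pasting a token straight into a notebook makes it easy to share the secret by accident. A minimal sketch of an alternative, assuming the token is exported in an environment variable: the variable name `SNYK_TOKEN` and the helper `load_snyk_token` are choices made for this example, not part of `pysnyk`.

```python
import os


def load_snyk_token(env_var: str = "SNYK_TOKEN") -> str:
    """Return the API token stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name your team has agreed on.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Snyk token."
        )
    return token


# Usage in the notebook (needs a real token, so it is left commented out):
# client = snyk.SnykClient(load_snyk_token())
```

The helper fails early with a readable message instead of sending an empty token to the API.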

We fetch all organisations and print them in a table. _Ignore the `_client` column_.

In [None]:
orgs = client.organizations.all()

pd.DataFrame(orgs)

Let's have a quick excursion into `pandas`. The `DataFrame` object has attributes and methods for inspecting it, such as `.columns`.

In [None]:
pd.DataFrame(orgs).columns

With that knowledge, let's extract only two columns to make the table easier to read.

In [None]:
pd.DataFrame(orgs)[['id', 'name']]

## Finding an organisation

Next up, let's look at a particular organisation by its slug. Note the `[0]` at the end: the condition could match multiple orgs, but we only keep the first one. You can extend the condition to return several.

In [None]:
org = [org for org in orgs if org.slug == 'e-corp'][0]

In [None]:
print("Found org with ID: {} & name: {}".format(org.id, org.name))
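The `[0]` lookup raises an `IndexError` when no organisation matches the slug. A small sketch of a more forgiving lookup, assuming only that each organisation object exposes a `slug` attribute as used above. The `find_org_by_slug` helper and the sample objects are hypothetical stand-ins so the snippet runs without an API call.

```python
from types import SimpleNamespace


def find_org_by_slug(orgs, slug):
    """Return the first organisation whose slug matches, or None."""
    return next((org for org in orgs if org.slug == slug), None)


# Stand-in objects for illustration only; in the notebook the list
# comes from client.organizations.all().
sample_orgs = [
    SimpleNamespace(id="1", name="E Corp", slug="e-corp"),
    SimpleNamespace(id="2", name="Allsafe", slug="allsafe"),
]

match = find_org_by_slug(sample_orgs, "e-corp")
missing = find_org_by_slug(sample_orgs, "does-not-exist")

print(match.name if match else "no match")
print(missing)
```

Returning `None` lets the caller decide how to react to a missing organisation instead of stopping the notebook with a traceback.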

### Analysing the projects within an organisation

Now that we have the organisation, let's look at the projects within. 

We saw earlier how to extract only certain columns; we will do the same here.

Start by fetching all projects and storing them in the `projects` variable. Also, let's print the list of columns we can query.

In [None]:
# Fetch all projects. Notice this one takes a bit of time because an actual query to the API is made.
projects = org.projects.all()
projectsFrame = pd.DataFrame(projects)

# Let's inspect the columns first.
print(projectsFrame.columns)

### What types of projects are scanned in this organisation?

We can run queries on `projects` as well. For example, let's first look at the types of projects we scan in Snyk.

Again, we can use the `DataFrame`'s ability to drill into a column and call the `unique` method to find the types we have.

In [None]:
projectsFrame['type'].unique()

Interesting, so there are quite a few project types currently scanned. This may be different depending on your organisation.

Let's fetch some details about the projects themselves. How about the name, the type and the number of dependencies in each project?

In [None]:
# Maybe we are interested in a particular type only?
# projects = [p for p in projects if p.type == 'maven']

# Show the data frame.
projectsFrame[['name', 'type', 'totalDependencies']]

That is a lot of rows (in my case). How can we filter this quickly?


### Filter out a project type

We may only be interested in one project type, for example `npm` projects, so let's filter for that.

In [None]:
projectsFrame[projectsFrame.type == 'npm'][['name', 'type', 'totalDependencies']]
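The same boolean-mask approach extends to several types at once with `Series.isin`. The frame below is made-up sample data that mimics the three columns used above, so the sketch runs without a Snyk connection. In the notebook you would apply it to `projectsFrame` instead.

```python
import pandas as pd

# Hypothetical sample rows; real data comes from org.projects.all().
sample = pd.DataFrame(
    {
        "name": ["web", "api", "batch", "infra"],
        "type": ["npm", "maven", "npm", "dockerfile"],
        "totalDependencies": [120, 45, 80, 0],
    }
)

# Keep only the project types we care about.
wanted_types = ["npm", "maven"]
filtered = sample[sample["type"].isin(wanted_types)]

# Show the largest projects first.
filtered = filtered.sort_values("totalDependencies", ascending=False)
print(filtered[["name", "type", "totalDependencies"]])
```

Using `sample["type"]` rather than `sample.type` avoids surprises when a column name collides with an existing `DataFrame` attribute.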

### Quick Pandas Math

Let's learn about the total dependencies as declared in the column above.

Pandas supports various methods like `.mean()` or `.max()` on a column. We can also use `.describe()` to see these statistics in one summary.

In [None]:
projectsFrame['totalDependencies'].describe()

## Use case: How many dependencies per type?

It can be interesting to know how many dependencies each project type pulls in. In pandas, this can be done with `groupby`. The interesting bit is how pandas aggregates each column according to its data: with `sum`, a numeric column such as `totalDependencies` is added up, while a boolean column such as `isMonitored` ends up as a count of its `True` values.

We use [DataFrame.sum](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sum.html) to sum up (or count) the data for the column we select.

Also, [DataFrame.count](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.count.html) is available.

In [None]:
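# numeric_only=True skips text and object columns, which cannot be summed meaningfully.
projectsFrame.groupby('type').sum(numeric_only=True)['totalDependencies']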

This kind of table can also be grouped on two dimensions.

In [None]:
projectsFrame.groupby(['origin', 'type']).count()['id']
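Sums and counts can also be computed side by side in a single table with named aggregation. The sketch below uses invented sample rows, and the output column names `projects` and `total_dependencies` are chosen for this example only.

```python
import pandas as pd

# Hypothetical sample rows that mimic the columns used above.
sample = pd.DataFrame(
    {
        "id": ["a", "b", "c", "d"],
        "type": ["npm", "npm", "maven", "pip"],
        "totalDependencies": [120, 80, 45, 30],
    }
)

# One row per project type: how many projects, and how many dependencies.
summary = sample.groupby("type").agg(
    projects=("id", "count"),
    total_dependencies=("totalDependencies", "sum"),
)
print(summary)
```

Each keyword names an output column and pairs a source column with the aggregation to apply, which keeps the intent visible in the code.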

### Focus on one column only

While it makes sense to get the overview for all columns, let's focus on just the number of dependencies.

In [None]:
projectsFrame.groupby('type')['totalDependencies'].sum().plot()

A nicer way to see this is by using a bar chart. This is quite easy with pandas and matplotlib.

In [None]:
projectsFrame.groupby('type')['totalDependencies'].sum().plot.bar();
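The workshop goals also mention a donut chart. One possible sketch, assuming the same per-type totals: a donut is a pie chart whose wedges are given a `width` smaller than the radius. The sample numbers are invented, and the ring width of `0.4` is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; in the notebook, use projectsFrame instead.
sample = pd.DataFrame(
    {
        "type": ["npm", "npm", "maven", "pip"],
        "totalDependencies": [120, 80, 45, 30],
    }
)
totals = sample.groupby("type")["totalDependencies"].sum()

fig, ax = plt.subplots()
# width < 1 leaves a hole in the middle, turning the pie into a donut.
ax.pie(totals, labels=list(totals.index), wedgeprops={"width": 0.4})
ax.set_title("Dependencies per project type")
```

In a notebook the figure renders when the cell finishes. In a plain script you would call `plt.show()` or `fig.savefig(...)` afterwards.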

## Management tasks (add & remove tags)

You can use `pysnyk` for other tasks as well, such as managing project tags.

In [None]:
TAG_NAME='business-unit'
TAG_VALUE='sir-christopher-wren'

# let's focus on a single project
project = client.projects.get(projects[0].id)

# Check whether the tag is already stored first; Snyk's API is quite strict about duplicates.
if { 'key': TAG_NAME, 'value': TAG_VALUE} in project.tags.all():
    project.tags.delete(TAG_NAME, TAG_VALUE)

In [None]:
project.tags.add(TAG_NAME, TAG_VALUE)

In [None]:
# notice how this is still empty!
print('Tags: {}'.format(project.tags.all()))

# Reload the project and try again.
project = client.projects.get(projects[0].id)

print('Tags: {}'.format(project.tags.all()))
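The check-then-add pattern above can be wrapped in a small helper so a tag is only added when it is missing. This is a sketch, not part of `pysnyk`: `ensure_tag` is a hypothetical helper, and `FakeTags` is an in-memory stand-in that copies only the `all`, `add` and `delete` calls used in this notebook, so the example runs without touching the API.

```python
class FakeTags:
    """Hypothetical in-memory stand-in for `project.tags` (illustration only)."""

    def __init__(self):
        self._tags = []

    def all(self):
        return list(self._tags)

    def add(self, key, value):
        self._tags.append({"key": key, "value": value})

    def delete(self, key, value):
        self._tags.remove({"key": key, "value": value})


def ensure_tag(tags, key, value):
    """Add the tag only if it is missing; return True when a tag was added."""
    if {"key": key, "value": value} in tags.all():
        return False
    tags.add(key, value)
    return True


tags = FakeTags()
first = ensure_tag(tags, "business-unit", "sir-christopher-wren")
second = ensure_tag(tags, "business-unit", "sir-christopher-wren")
print(first, second, tags.all())
```

With the real client you would pass `project.tags`. As the cells above show, the local tag list is not refreshed after `add`, so reload the project with `client.projects.get(...)` before checking again.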