Adds gitlab documentation #70

catenacyber · 2021-12-28T12:55:34Z

For google/oss-fuzz#7073
cc @jonathanmetzman

docs/running-clusterfuzzlite/gitlab.md

securitykernel · 2021-12-31T16:24:32Z

docs/running-clusterfuzzlite/gitlab.md

+{% raw %}
+```yaml
+variables:
+  SANITIZER: "address"


Actually we can make use of matrix builds similar to GitHub, like this, which is quite nice because it avoids job duplication:

    parallel:
      matrix:
        - SANITIZER: [address, undefined, memory] # customize to your needs

Indeed.
I wanted to have a minimal config for this doc first.
So adding this tip afterwards
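
For context, a minimal sketch of how the matrix tip might slot into the job under review. The job name `clusterfuzzlite`, the `fuzz` stage, and the image name and tag are assumptions made for illustration; the `before_script` and `script` lines are copied from the diff in this PR. Any other variables the integration needs are left out.

```yaml
# Sketch only: job name, stage, and image are illustrative assumptions.
stages:
  - fuzz

clusterfuzzlite:
  stage: fuzz
  image:
    # Assumed image name and tag, not confirmed by this thread.
    name: gcr.io/oss-fuzz-base/clusterfuzzlite-build-fuzzers:v1
    # Clear the image entrypoint so GitLab runs the script lines itself.
    entrypoint: [""]
  parallel:
    matrix:
      # One job per sanitizer; takes the place of a single top-level SANITIZER variable.
      - SANITIZER: [address, undefined, memory]
  before_script:
    - export CFL_CONTAINER_ID=`cut -c9- < /proc/1/cpuset`
  script:
    # Builds and runs the fuzzers in a single job.
    - python3 "/opt/oss-fuzz/infra/cifuzz/cifuzz_combined_entrypoint.py"
```

With this shape GitLab expands the job into one instance per listed sanitizer, each seeing its own `SANITIZER` value.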

I think you should just copy what the github docs do here: https://github.com/google/clusterfuzzlite/blob/main/docs/running-clusterfuzzlite/github_actions.md?plain=1#L61

Also, nit but are the quotes needed here?

Indeed not

securitykernel · 2021-12-31T16:25:39Z

docs/running-clusterfuzzlite/gitlab.md

+  tags:
+    - fuzz


I would leave these out of the example, they are quite deployment-specific and won't work on gitlab.com shared runners out of the box.

Interesting, I need these on some hosted gitlab.
Changing this in next commit

securitykernel · 2021-12-31T16:28:52Z

docs/running-clusterfuzzlite/gitlab.md

+    - export CFL_CONTAINER_ID=`cut -c9- < /proc/1/cpuset`
+  script:
+    # will build and run the fuzzers
+    - python3 "/opt/oss-fuzz/infra/cifuzz/cifuzz_combined_entrypoint.py"


The Prow integration uses a different container with the combined entrypoint gcr.io/k8s-testimages/clusterfuzzlite:latest. Is it ok to use the build_fuzzers image for running the fuzzers, too, and what is the difference between the build_fuzzers, run_fuzzers and clusterfuzzlite:latest images in this regard? That is maybe more of a question for @jonathanmetzman.

Leaving this to Jonathan

The images are exactly the same (for now) except for their entrypoints.
I'm not sure this is something I want to support, and I don't like the fact that this is using build_fuzzers for running fuzzers.
I can publish a cifuzz-combined image that uses the combined entrypoint to support both.
But let me explain why we have different images.
On GitHub, having two images makes it easy to separate build logs from run logs.
See how in these logs you can click to view the build step and run step separately.
Now, the drawback of this approach is that I think it has made the interface more complex: instead of having to invoke CIFuzz once, it must be invoked twice. See how in this example two steps must be specified: https://google.github.io/clusterfuzzlite/running-clusterfuzzlite/github-actions/#pr-fuzzing

In hindsight, I think it might have been a small mistake to have separated the two actions. So the question is, should we have separate actions on GitLab as well (which will at least make things consistent)?

@oliverchang WDYT of all this?

I see your point now. And I agree that it looks nicer in the logs. In GitLab the log view is different. You would have to specify two jobs per sanitizer, effectively it would look like this (shows 8 jobs):

A click on each job leads to the job log.

Not sure. As I mentioned, I think 2 actions/images makes the interface for users more complicated, but I guess that interface is only used once while the separation provides benefits that are experienced many times.

Sorry for the late reply. I think we should try to be as consistent as we can and keep the two separate entrypoints for gitlab as well. Does this add significantly more complexity to the gitlab integration?

+1

@catenacyber can you please do this?

Looking into this, I think this adds significant complexity, compared to already integrated CI systems.

We need to share the fuzzers built by clusterfuzzlite-build-fuzzers with clusterfuzzlite-run-fuzzers.
As these are 2 different docker images, we need 2 different GitLab jobs, so they do not share any workspace...
To share data between jobs (no matter if they belong to the same pipeline or not), we need either cache or artifacts.

I am trying the solution where clusterfuzzlite-build-fuzzers ends with `cp -r $CI_BUILDS_DIR/$CI_JOB_ID/build-out/ $CI_PROJECT_DIR/artifacts/build-out/$SANITIZER` and clusterfuzzlite-run-fuzzers begins with `mv $CI_PROJECT_DIR/artifacts/build-out/$SANITIZER $CI_BUILDS_DIR/$CI_JOB_ID/build-out`.
Anyway, the build-out contents need to be put in a specific directory under `$CI_PROJECT_DIR` (aka project_src_path), and we need to differentiate between the builds for the different sanitizers.

This can also work with cache instead of artifacts.
Note that I need to use `$CI_BUILDS_DIR/$CI_JOB_ID` as GitLab does not know `$WORKSPACE`.
Another solution would be to bring the workspace back into the project directory... But we would still need to differentiate between the sanitizers...
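
A rough sketch of the artifact-based hand-off described in this comment, under several assumptions: the image names and tags, the two entrypoint script paths, and the stage and job names are illustrative and not confirmed by this thread. The `cp` and `mv` commands follow the ones quoted above. This has not been validated against a runner.

```yaml
# Sketch only: image names, entrypoint paths, stage and job names are assumptions.
stages:
  - build
  - fuzz

# Hidden template so both jobs fan out over the same sanitizers.
.sanitizers:
  parallel:
    matrix:
      - SANITIZER: [address, undefined, memory]

build-fuzzers:
  extends: .sanitizers
  stage: build
  image:
    name: gcr.io/oss-fuzz-base/clusterfuzzlite-build-fuzzers:v1  # assumed
    entrypoint: [""]
  script:
    - python3 /opt/oss-fuzz/infra/cifuzz/build_fuzzers_entrypoint.py  # assumed path
    # Copy the build output into the project directory, keyed by sanitizer,
    # because artifact paths must live under $CI_PROJECT_DIR.
    - mkdir -p "$CI_PROJECT_DIR/artifacts/build-out"
    - cp -r "$CI_BUILDS_DIR/$CI_JOB_ID/build-out/" "$CI_PROJECT_DIR/artifacts/build-out/$SANITIZER"
  artifacts:
    paths:
      - artifacts/build-out/$SANITIZER
    expire_in: 1 day

run-fuzzers:
  extends: .sanitizers
  stage: fuzz
  image:
    name: gcr.io/oss-fuzz-base/clusterfuzzlite-run-fuzzers:v1  # assumed
    entrypoint: [""]
  script:
    # Artifacts from the build stage are downloaded before the script runs;
    # move this sanitizer's build to where the run step expects it.
    - mkdir -p "$CI_BUILDS_DIR/$CI_JOB_ID"
    - mv "$CI_PROJECT_DIR/artifacts/build-out/$SANITIZER" "$CI_BUILDS_DIR/$CI_JOB_ID/build-out"
    - python3 /opt/oss-fuzz/infra/cifuzz/run_fuzzers_entrypoint.py  # assumed path
```

By default each `run-fuzzers` job downloads the artifacts of every job in the earlier stage, so all sanitizer builds arrive and only the matching one is moved into place. This is the extra plumbing that the single combined job avoids.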

docs/running-clusterfuzzlite/gitlab.md

And the job thus fails

jonathanmetzman

Thanks!

docs/running-clusterfuzzlite/gitlab.md

jonathanmetzman · 2021-12-30T20:58:01Z

docs/running-clusterfuzzlite/gitlab.md

+But then, the `.gitlab-ci.yml` should be different, and explicitly call the `docker` commands
+on clusterfuzz-lite images.
+
+Docker-in-Docker does not seem possible as clusterfuzz-lite images


ClusterFuzzLite

It isn't possible with these images but it is possible in general. The prow image does this.
In any case though, I'd just get rid of this line. Is it even worth mentioning?

@securitykernel managed to get Docker-in-Docker to work.
I did not yet, having a problem with setting DOCKER_HOST to the right value on my setup...

So, removing this, and hoping to add how to set it up with Docker-in-Docker

Why do we want docker-in-docker?

I think it's slower.

It's different from how things are done on GH, so it's not the main supported way we have of doing things.

I am not sure we want it.
It is one of the 3 supported ways to use Docker in GitLab CI, cf. https://docs.gitlab.com/ee/ci/docker/using_docker_build.html

jonathanmetzman · 2022-01-04T16:44:01Z

docs/running-clusterfuzzlite/gitlab.md

jonathanmetzman · 2022-01-04T16:44:24Z

docs/running-clusterfuzzlite/gitlab.md

jonathanmetzman · 2022-01-04T17:00:04Z

docs/running-clusterfuzzlite/gitlab.md

+
+## Extra configuration
+
+### Gitlab artifacts filestore


Thanks for going through the trouble of setting this up.
I have some high level comments.

This section seems to offer two options: 1. ??? and 2. using a personal access token.
Can you tell me the tradeoff between the two and we will decide which one to recommend and then we won't even mention the other option.

In the GitHub docs we have screenshots for setting up a personal access token (https://google.github.io/clusterfuzzlite/running-clusterfuzzlite/github-actions/#storage-repo). If we go that route, can you add screenshots here?

Can you add screenshots and explanation for downloading artifacts please?

This section makes artifacts seem optional, why should they be optional?

The reason we use the git filestore on GitHub in addition to actions is two-fold: 1. It makes browsing coverage reports on the web possible. 2. It makes it possible for two batch fuzzing jobs to save to the corpus at once (side note: @oliverchang: have we ever tested this? Does batch fuzzing force push so that committing works even if it's behind HEAD?). Should we do the same on GitLab?

re parallel pushing: we auto rebase and retry when push fails to make this work.

> Can you tell me the tradeoff between the two and we will decide which one to recommend and then we won't even mention the other option.

The downside of the artifacts is that you need to set up an API token.
The downside of the cache is that you should ensure that runners share the cache.

I think I gather that in the end you want a GitLab filestore that will:

- push crashes as job artifacts
- push and pull coverage reports in another git repo
- push and pull builds from a cache
- push and pull corpus from a cache

Why use cache and artifacts? Why not just one?
Why use git repo for reports? Does gitlab have something like github pages?
Sorry if these are dumb questions/answered elsewhere.

jonathanmetzman · 2022-01-04T17:04:51Z

docs/running-clusterfuzzlite/gitlab.md

+```
+{% endraw %}
+
+You should then define two [schedules](https://docs.gitlab.com/ee/ci/pipelines/schedules.html)


Maybe have some screenshots walk through an example of setting this up.

I can do that once the rest is good for you

jonathanmetzman · 2022-01-04T17:14:46Z

docs/running-clusterfuzzlite/gitlab.md

catenacyber · 2022-01-10T21:01:08Z

@jonathanmetzman you can take a look at GitLab Pages output here :
https://catenacyber.gitlab.io/suricata-cfl/coverage/latest/report/linux/report.html

Is it good to have everything public or should we restrict it to what is in coverage/latest/report/ ?

If everything is good for you, I can make the screenshots

jonathanmetzman

I think this basically lgtm.
Only thing left is:

- decide if we want to use one image or two
- fix nits
- let the clusterfuzzlite integration of gitlab (oss-fuzz#7073) land before merging this

docs/clusterfuzzlite.md

docs/running_clusterfuzzlite.md

jonathanmetzman · 2022-01-11T18:31:08Z

docs/running-clusterfuzzlite/gitlab.md

jonathanmetzman · 2022-01-11T21:31:01Z

> @jonathanmetzman you can take a look at GitLab Pages output here : https://catenacyber.gitlab.io/suricata-cfl/coverage/latest/report/linux/report.html
>
> Is it good to have everything public or should we restrict it to what is in coverage/latest/report/ ?
>
> If everything is good for you, I can make the screenshots

No need to make it private IMO.
Yes, please make screenshots (keeping in mind that we might split up the action into 2 like on github)

catenacyber · 2022-01-13T08:48:58Z

The screenshots are available here :
http://catenacyber.fr/gitlab-schedule-mode.png
http://catenacyber.fr/gitlab-project-token.png
http://catenacyber.fr/gitlab-variable-token.png

Otherwise nits are fixed, so should be good.
Waiting for your decision about splitting or not building and running...

jonathanmetzman · 2022-01-14T19:22:00Z

I've copied these to our bucket can you replace them with:
https://storage.googleapis.com/clusterfuzzlite-public/images/gitlab-schedule-mode.png
https://storage.googleapis.com/clusterfuzzlite-public/images/gitlab-project-token.png
https://storage.googleapis.com/clusterfuzzlite-public/images/gitlab-variable-token.png

catenacyber · 2022-01-16T20:47:14Z

The links already pointed to these urls in gitlab.md ;-)

CarpeDiem-CarpeNoctem · 2022-01-26T13:03:20Z

Hi there,

Sorry for being late, but while I was working with this integration I had a problem concerning the corpus feature.

Maybe it should be mentioned in the docs that the folder name in the external corpus git repository must be "corpus/{fuzztargetname}". It has to follow this structure or the GitLab integration will not find the necessary corpus files.

Best regards

jonathanmetzman · 2022-01-26T16:37:12Z

Is this for manually adding elements to the corpus?

catenacyber · 2022-01-26T19:37:06Z

You can also use a seed corpus, as on OSS-Fuzz, with a {fuzztargetname}_seed_corpus.zip file ;-)

catenacyber added 2 commits December 28, 2021 13:54

Adds gitlab documentation

8014cb9

Update doc about gitlab's cache

fa798ce

securitykernel suggested changes Dec 31, 2021

catenacyber added 3 commits January 1, 2022 21:15

Doc review update

95ddc8f

Update doc to match latest PR changes

06cad95

upload artifacts when there is a crash

21b8d68

And the job thus fails

jonathanmetzman reviewed Jan 4, 2022

catenacyber added 3 commits January 6, 2022 09:21

comments into account

b23eb96

use matrix for sanitizer

5a6a820

filestore with git

ac2d971

jonathanmetzman mentioned this pull request Jan 7, 2022

Is someone working on integrating this into Gitlab's CI? #73

Closed

docs for gitlab pages

d36951a

jonathanmetzman reviewed Jan 11, 2022

catenacyber added 2 commits January 13, 2022 09:31

fixup nits

9a30327

Adds screenshots

a1e2d12

fixup docker image names

aaf474d

jonathanmetzman merged commit 08da142 into google:main Jan 26, 2022

jonathanmetzman mentioned this pull request Jan 26, 2022

Fix issues in gitlab docs. #83

Merged

catenacyber mentioned this pull request Sep 16, 2022

Gitlab instructions do not just work. #100

Closed
