added cookbook
Jithin James committed Mar 4, 2023
1 parent 19a9339 commit b141e50
Showing 4 changed files with 810 additions and 101 deletions.
1 change: 1 addition & 0 deletions data/datasets/nsfw_csam_reddit/.gitignore
@@ -1,2 +1,3 @@
dataframes/*.csv
comments_cache/*.csv
*.csv
6 changes: 6 additions & 0 deletions data/datasets/nsfw_csam_reddit/README.md
@@ -0,0 +1,6 @@
# NSFW - CSAM from Reddit

Note(TODO): this is the pipeline, will need to scale this dataset by getting
data from file.pushshift.io
The scripts and notebooks in the directory are used to create the NSFW and CSAM
dataset from Reddit that can be used to train the safety model.
