
semantic space and size

a proof-of-concept for exploring cultural bias on (very) small corpora using cosine similarity and word2vec

note: this repo contains references to offensive language found in alt-right and other subreddits, including references to assault, gendered insults & profanity. links to subreddits, as well as the subreddit corpora used in this project, may also include offensive material.

for detailed information about this project, please check out this post.

  • find the main code for this project here

  • data for this project, including full subreddits, clean & raw text, vocabulary, & final cosine similarity comparisons, is here
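
as a rough illustration of the kind of cosine similarity comparison mentioned above, here is a minimal sketch. it is not the project's actual code: the function names, the corpus labels and the 3-dimensional vectors are all made up for illustration, and it uses only the python standard library rather than the libraries listed below. in practice the vectors would come from a word2vec model trained on each corpus.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# hypothetical toy "embeddings": real word2vec vectors typically have
# 100+ dimensions and are learned separately from each corpus
toy_vectors = {
    "corpus_a": {"big": [0.9, 0.1, 0.3], "strong": [0.8, 0.2, 0.4]},
    "corpus_b": {"big": [0.2, 0.9, 0.1], "strong": [0.7, 0.1, 0.6]},
}


def compare_pair(word_1, word_2, vectors_by_corpus):
    """Score the same word pair inside each corpus's own vector space."""
    return {
        name: cosine_similarity(vectors[word_1], vectors[word_2])
        for name, vectors in vectors_by_corpus.items()
    }


scores = compare_pair("big", "strong", toy_vectors)
for corpus_name, score in scores.items():
    print(f"{corpus_name}: {score:.3f}")
```

a score near 1 means the two words point in nearly the same direction in that corpus's space, and a score near 0 means they are close to unrelated. comparing the score for the same pair across corpora is one simple way to see how differently each community uses the words, though on very small corpora the numbers are likely to be noisy.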

citations & docs

this repository makes use of a number of outside libraries and resources. here are a few, with documentation and other information.

libraries & packages

scikit-learn

pandas

numpy

nltk

keras

docker