Based on MStream
conda create python=3.8 -n spec-project
pip install -r requirements.txt
python -m spacy download en_core_web_lg
conda install -c conda-forge lightgbm
brew install libomp
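After the install steps above, a quick import check can confirm that the environment is usable. The following is a minimal sketch, not part of the project; the helper name `missing_packages` is hypothetical, and the package list is only what the commands above mention (the full set is in requirements.txt).

```python
import importlib.util

# Packages named by the setup commands above; requirements.txt may list more.
# en_core_web_lg is included on the assumption that spaCy models install as
# ordinary importable packages.
EXPECTED = ("spacy", "en_core_web_lg", "lightgbm", "streamlit")


def missing_packages(names=EXPECTED):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All expected packages are importable.")
```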
- Add Twitter API keys to the config/twitter.json file.
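The schema of config/twitter.json is not documented here, so the sketch below only illustrates one way to load it and fail early when a key is missing. The key names `api_key` and `api_secret` and the function `load_twitter_config` are assumptions made for illustration; check the Streamlit scripts for the names they actually read.

```python
import json
from pathlib import Path

# Hypothetical key names: the real schema of config/twitter.json may differ.
REQUIRED_KEYS = ("api_key", "api_secret")


def load_twitter_config(path="config/twitter.json", required=REQUIRED_KEYS):
    """Load the JSON config and raise if an expected key is absent."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in required if key not in config]
    if missing:
        raise KeyError(f"{path} is missing keys: {', '.join(missing)}")
    return config
```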
streamlit run st_tweet_count_api.py
- Reads from the Twitter API
- Saves to data/tweet_objects
streamlit run st_explore_tweet_counts.py
- Reads from data/tweet_counts and the Twitter API
- Saves to data/tweet_objects
streamlit run st_explore_tweet_objects.py
- Reads from data/tweet_objects
juptyer_notebooks/label_tweet_dataset.ipynb
- Reads from data/tweet_objects
- Writes to data/labeled_datasets
streamlit run st_featurize_tweets.py
- Reads from data/tweet_objects
python mstream_pipeline.py <input_filename>
- Reads from data/labeled_datasets
- Writes to MStream/data and data/embeddings/
python prepare_mstream_data.py <input_filename> <output_filename>
- Reads from data/labeled_datasets
- Writes to MStream/data
python create_embeddings.py <input_filename> <output_filename>
- Prerequisite: download the fastText embeddings and save them to data/embeddings/fasttext
- Reads from data/labeled_datasets and data/embeddings/fasttext
- Writes to data/embeddings/
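How create_embeddings.py turns the fastText vectors into tweet embeddings is not described here. As a rough illustration only, the sketch below reads the fastText text (.vec) format, where the first line holds the vocabulary size and dimension and each following line holds a word and its values, and then averages the vectors of the tokens it knows. Averaging is a common baseline and an assumption on my part, not necessarily what the script does; `load_vec` and `average_embedding` are hypothetical names.

```python
def load_vec(path, limit=None):
    """Read a fastText .vec text file into a {word: [float, ...]} dict."""
    vectors = {}
    with open(path, encoding="utf-8", errors="ignore") as f:
        f.readline()  # skip the header line: "<vocab_size> <dimension>"
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break  # optionally keep only the first `limit` words
            word, *values = line.rstrip().split(" ")
            vectors[word] = [float(v) for v in values]
    return vectors


def average_embedding(tokens, vectors):
    """Average the vectors of known tokens; return None if none are known."""
    found = [vectors[t] for t in tokens if t in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]
```

Passing `limit` keeps memory use manageable, since the full pretrained files are large.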
cd MStream
./run.sh <input_dataset_name>
- Reads from MStream/data
- Writes to MStream/data
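The three manual steps above (prepare the MStream data, create the embeddings, run MStream) can be chained from Python. This is a minimal sketch under stated assumptions: that it is run from the repository root, that both scripts accept the same output filename, and that the caller knows how `<input_dataset_name>` relates to that filename, which this document does not specify. The function names are hypothetical.

```python
import subprocess
import sys


def manual_pipeline_commands(input_filename, output_filename, dataset_name):
    """Return (command, working_directory) pairs in the order listed above."""
    return [
        ([sys.executable, "prepare_mstream_data.py", input_filename, output_filename], "."),
        ([sys.executable, "create_embeddings.py", input_filename, output_filename], "."),
        (["./run.sh", dataset_name], "MStream"),  # equivalent to: cd MStream && ./run.sh
    ]


def run_manual_pipeline(input_filename, output_filename, dataset_name):
    """Run each step in turn; check=True stops at the first failing command."""
    for command, cwd in manual_pipeline_commands(input_filename, output_filename, dataset_name):
        subprocess.run(command, cwd=cwd, check=True)
```

Keeping command construction separate from execution makes it possible to print or inspect the commands before running anything.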
streamlit run st_main.py
- Reads from data/tweet_objects
- Reads from MStream/data