medeffects/tweet_corpora

Tweet Corpora

Data Format:

  • Delimiter: Tab
  • Description of Fields:
          1st field: Tweet id
          2nd field: Annotation (whether the tweet is a personal experience tweet, PET)
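
As a minimal sketch of reading this format, the Python snippet below parses tab-delimited lines into (tweet id, annotation) pairs. The function name `read_annotations` and the sample ids and label values are hypothetical, chosen only for illustration; the actual label encoding and whether the files carry a header row should be checked against the data. Tweet ids are kept as strings so that long ids are not altered by numeric conversion.

```python
import csv
import io


def read_annotations(handle):
    """Yield (tweet_id, annotation) pairs from a tab-delimited file object."""
    reader = csv.reader(handle, delimiter="\t")
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        # Keep both fields as strings; the label encoding is not assumed here.
        yield row[0].strip(), row[1].strip()


# Hypothetical sample data in the two-field layout described above.
sample = io.StringIO("1234567890\t1\n9876543210\t0\n")
pairs = list(read_annotations(sample))
```

For a file on disk, the same function can be applied to `open(path, newline="", encoding="utf-8")`.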

Suggested Download Tool:

The original tweets may be obtained with a script such as download_tweets.py, available at http://diego.asu.edu/downloads/publications/ADRMine/download_tweets.zip.

Dietary Supplement Corpus:

Description: Personal experience tweets related to dietary supplements
Dataset (8,770 tweets): https://github.com/medeffects/tweet_corpora/blob/master/SupplementTweetCorpus8770-20160704.csv
Reference: Jiang, K., Calix, R.A., & Gupta, M. (2016). Construction of a Personal Experience Tweet Corpus for Health Surveillance. In Proceedings of the 15th Workshop on Biomedical Natural Language Processing (pp. 128-135). http://www.aclweb.org/anthology/W16-2917

Medication Corpus:

Description: Personal experience tweets related to medications
Dataset (12,331 tweets combined): training set (8,612 tweets) at https://github.com/medeffects/tweet_corpora/blob/master/MedicineCorpusTrainingSet8612-20170501.csv and test set (3,719 tweets) at https://github.com/medeffects/tweet_corpora/blob/master/MedicineCorpusTestSet3719-20170501.csv
Reference: Jiang, K., Feng, S., Calix, R.A., Gupta, M., & Bernard, G.R. (2018). Identifying Tweets of Personal Health Experience through Word Embedding and LSTM. BMC Bioinformatics, 19(Suppl 8), 210. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2198-y
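
The dataset links above point to GitHub "blob" pages, which return an HTML view rather than the file itself. As a hedged sketch, the helper below rewrites such a link into the corresponding raw.githubusercontent.com address, which can then be passed to a downloader. The function name `blob_to_raw` is hypothetical, and the snippet only transforms the string; it performs no network access.

```python
def blob_to_raw(url):
    """Convert a GitHub blob page URL into its raw-file URL."""
    prefix = "https://github.com/"
    if not url.startswith(prefix) or "/blob/" not in url:
        raise ValueError(f"not a GitHub blob URL: {url}")
    # Split "owner/repo" from "branch/path/to/file" at the first "/blob/".
    owner_repo, branch_and_path = url[len(prefix):].split("/blob/", 1)
    return f"https://raw.githubusercontent.com/{owner_repo}/{branch_and_path}"


raw_url = blob_to_raw(
    "https://github.com/medeffects/tweet_corpora/blob/master/"
    "SupplementTweetCorpus8770-20160704.csv"
)
```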
