SCKR-Course

Peking University Semantic Computation and Knowledge Retrieval Course Project

WordSimilarity

Compute word similarity with several methods: WordNet-based measures, web-search co-occurrence measures, and a word2vec model trained on text8. A minimal sketch of two of these measures follows the file list below.

  • data

    MTURK-771.csv: Data set with the ground truth.

    text8: The Wikipedia corpus used to train the word2vec model. (Please add glove.6B.300d.txt to the data directory yourself.)

  • result

    corpus: The word2vec results and trained model.

    web_search: The results of the Jaccard, Overlap, PMI, and Dice measures.

    wordnet: The results of the path, wup, lch, res, lin, and jcn measures.

  • word_similarity.py: Source code.

  • report.pdf: The course project report.
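
As an illustration, here is a minimal sketch of two of the measures above: the maximum WordNet path similarity over synset pairs (via NLTK) and word2vec cosine similarity trained on text8 (via gensim), both scored against the MTURK-771 ground truth with Spearman correlation. The file paths and CSV column order are assumptions; the actual implementation lives in word_similarity.py.

```python
# Hedged sketch: two of the similarity measures described above, assuming
# NLTK's WordNet interface and gensim's Word2Vec (4.x API) are used.
# File paths come from the repository layout; the CSV column order is an assumption.
import csv

from nltk.corpus import wordnet as wn
from gensim.models import Word2Vec
from gensim.models.word2vec import Text8Corpus
from scipy.stats import spearmanr


def wordnet_path_similarity(w1, w2):
    """Maximum path similarity over all synset pairs (one of the WordNet measures)."""
    best = 0.0
    for s1 in wn.synsets(w1):
        for s2 in wn.synsets(w2):
            sim = s1.path_similarity(s2)
            if sim is not None and sim > best:
                best = sim
    return best


# Train word2vec on the text8 corpus shipped in data/ (path is an assumption).
model = Word2Vec(Text8Corpus("data/text8"), vector_size=300, window=5, min_count=5)

human, path_scores, w2v_scores = [], [], []
with open("data/MTURK-771.csv") as f:
    for row in csv.reader(f):
        try:
            word1, word2, score = row[0], row[1], float(row[2])  # assumed column order
        except (IndexError, ValueError):
            continue  # skip a header row or malformed line
        human.append(score)
        path_scores.append(wordnet_path_similarity(word1, word2))
        w2v_scores.append(model.wv.similarity(word1, word2)
                          if word1 in model.wv and word2 in model.wv else 0.0)

print("Spearman (WordNet path):", spearmanr(human, path_scores).correlation)
print("Spearman (word2vec):   ", spearmanr(human, w2v_scores).correlation)
```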

SentimentClassification

Tweet Sentiment Classification (SemEval2017 Task 4 Subtask A)

  • data

    glove.6B.300d: Pre-trained GloVe word vectors (dimension = 300). (Please add glove.6B.300d.txt to the data directory yourself.) A sketch of loading these vectors follows the file list below.

    twitter-2016train-A / twitter-2016dev-A / twitter-2016test-A: Tweets, split into training, validation, and test sets.

  • code: Source code.

  • report.pdf: The course project report.
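
As an illustration, here is a minimal sketch of loading glove.6B.300d.txt into an embedding matrix for the tweet classifier, assuming the standard one-word-per-line GloVe text format. The toy vocabulary is purely illustrative; the real classifier in code/ may index its vocabulary differently.

```python
# Hedged sketch: build an embedding matrix from glove.6B.300d.txt.
# The vocabulary below is a toy example, not the project's actual vocabulary.
import numpy as np


def load_glove(path="data/glove.6B.300d.txt"):
    """Read GloVe vectors into a {word: vector} dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors


def build_embedding_matrix(vocab, glove, dim=300):
    """Row i holds the vector for vocab word i; unknown words stay at zero."""
    matrix = np.zeros((len(vocab), dim), dtype=np.float32)
    for i, word in enumerate(vocab):
        if word in glove:
            matrix[i] = glove[word]
    return matrix


glove = load_glove()
vocab = ["<pad>", "good", "bad", "happy"]       # toy vocabulary for illustration
embedding_matrix = build_embedding_matrix(vocab, glove)
print(embedding_matrix.shape)                   # (4, 300)
```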

DBQA

Document-based Question Answering task (DBQA)

  • data

    hanlp-wiki-vec-zh.txt: Pre-trained word vectors (dimension = 300). (Please add hanlp-wiki-vec-zh.txt to the data directory yourself.)

    stop_words.txt: Chinese stop words.

    nlpcc-iccpol-2016.dbqa.training-data / nlpcc-iccpol-2016.dbqa.testing-data / test.txt: NLPCC 2017 DBQA data, split into training, validation, and test sets.

  • code

    DBQA_CNN&Attention1: CNN with Attention Model 1; a sketch of this kind of question/answer matcher follows the file list below.

    DBQA_CNN&Attention2: CNN with Attention Model 2.

  • report.pdf: The course project report.
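
As an illustration, here is a minimal PyTorch sketch of a CNN-with-attention matcher that scores a question against a candidate answer sentence. The layer sizes, pooling, and attention scheme are assumptions for illustration only, not taken from the actual DBQA_CNN&Attention1/2 code or the report.

```python
# Hedged sketch of a CNN-plus-attention question/answer matcher; the two
# models in code/ may be structured differently.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNAttentionMatcher(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, num_filters=128, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size, padding=1)

    def encode(self, tokens):
        # (batch, seq) -> (batch, seq, filters): embed, then convolve over time.
        x = self.embed(tokens).transpose(1, 2)           # (batch, embed, seq)
        return torch.relu(self.conv(x)).transpose(1, 2)  # (batch, seq, filters)

    def forward(self, question, answer):
        q = self.encode(question)                        # (batch, q_len, filters)
        a = self.encode(answer)                          # (batch, a_len, filters)
        # Attention: weight each answer position by its best match to the question.
        scores = torch.bmm(a, q.transpose(1, 2)).max(dim=2).values   # (batch, a_len)
        attn = torch.softmax(scores, dim=1)
        a_vec = torch.bmm(attn.unsqueeze(1), a).squeeze(1)   # attended answer vector
        q_vec = q.max(dim=1).values                          # max-pooled question vector
        return F.cosine_similarity(q_vec, a_vec)             # relevance score per pair


model = CNNAttentionMatcher(vocab_size=10000)
q = torch.randint(1, 10000, (2, 20))    # two toy question/answer pairs
a = torch.randint(1, 10000, (2, 40))
print(model(q, a).shape)                # torch.Size([2])
```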
