DBpedia-Chatlog-Analysis

Discourse analysis for the DBpedia chatbot: http://chat.dbpedia.org/

Description of notebooks:

  1. data_exploration.ipynb contains code for grouping chats by user_id and for preliminary analysis, such as finding the average conversation length and the number of users.

  2. analysis.ipynb covers the following:

    • the most used channel (web, Slack, or Facebook Messenger)
    • the number of failed responses per conversation and the number of questions that did not satisfy users
    • the conversation length after negative feedback
    • the character length of user requests
    • commonly asked topics, found with named entity recognition (NER)
    • whether coreferences exist
    • the language of user requests
  3. Use dependency_parsing.ipynb to estimate the number of complex questions asked and to prepare the input (candidate pairs) for intent clustering.

  4. The clustering folder contains two implementations (KMeans and HDBSCAN) for finding latent intents in utterance representations. Use get_sentence_embeddings.ipynb, preferably on Google Colab, to fetch the sentence embeddings used to cluster user requests by their semantics.
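
The preliminary statistics described for data_exploration.ipynb and analysis.ipynb can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code from the notebooks: the record layout, the field names `channel` and `text`, and the sample messages are all assumptions made for the example, and each user's messages are treated as a single conversation.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical chat-log records. The field names and values are assumptions
# for illustration; the real log schema may differ.
chat_log = [
    {"user_id": "u1", "channel": "web", "text": "Who is Alan Turing?"},
    {"user_id": "u1", "channel": "web", "text": "Where was he born?"},
    {"user_id": "u2", "channel": "slack", "text": "What is DBpedia?"},
    {"user_id": "u1", "channel": "web", "text": "Thanks"},
    {"user_id": "u3", "channel": "web", "text": "Population of Berlin"},
]


def group_by_user(records):
    """Group records by user_id, keeping the original message order."""
    conversations = defaultdict(list)
    for record in records:
        conversations[record["user_id"]].append(record)
    return dict(conversations)


conversations = group_by_user(chat_log)

# Number of distinct users and average number of messages per user.
num_users = len(conversations)
avg_conversation_length = mean(len(msgs) for msgs in conversations.values())

# Most used channel and average character length of user requests.
channel_counts = Counter(record["channel"] for record in chat_log)
most_used_channel = channel_counts.most_common(1)[0][0]
avg_request_chars = mean(len(record["text"]) for record in chat_log)

print(num_users, avg_conversation_length, most_used_channel, avg_request_chars)
```

If the real log records session boundaries or timestamps, a conversation would more plausibly be keyed on those rather than on user_id alone.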
