Releases: CornellNLP/ConvoKit

ConvoKit Version 4.1.1

01 May 16:56
ccfc8e8

We’re excited to release ConvoKit 4.1.1, which adds support for Gemini API keys in the GenAI module.

What’s New

Added support for Gemini API keys in the GenAI module: #347

ConvoKit Version 4.1.0

10 Mar 14:28
eedbb5f

We're excited to release ConvoKit 4.1.0, featuring a revamped ConvoKit website, new multilingual WikiConv datasets, five other new datasets, and added examples for recently released features.

1. ConvoKit Website Revamp

We have completely redesigned the main ConvoKit website to better accommodate the ever-growing list of available datasets and features. The new site lets you search for datasets and features by tag and supports light and dark modes.

2. Multilingual WikiConv

We have added datasets from the German, Russian, Chinese, and Greek versions of Wikipedia Talk Pages, in addition to the existing English WikiConv dataset. For more information, check out the corpora on the WikiConv page, which contains updated download instructions.

3. Additional Datasets

Thanks to contributions from students in CS 6742, we also have five new datasets covering a wide range of domains and conversation dynamics. More information on these corpora can be found on our website and documentation site.

4. New Examples

We have added new examples for Redirection and Utterance Likelihood, Talk Time Sharing Dynamics, Summary of Conversation Dynamics (SCD), ConDynS, and Pivotal Moments to our examples page. Check them out here.

ConvoKit Version 4.0.0

03 Nov 14:20
3a60019

We're excited to release ConvoKit 4.0.0, featuring major enhancements that bring LLM-powered analysis capabilities to conversational data processing.

1. GenAI Module: LLM Prompt Transformer

The new GenAI module introduces LLMPromptTransformer, a flexible transformer that lets users apply LLM prompts for arbitrary tasks across the conversational elements of a corpus. With the new module, it is easy to:

  • Apply prompts at multiple levels: utterances, conversations, speakers, or the entire corpus
  • Use multiple LLM providers (OpenAI GPT, Google Gemini, local models)
  • Manage API keys and model settings through unified configuration
  • Store LLM responses directly as metadata on corpus objects

For more information, see the GenAI module guide.
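
The pattern described in the list above can be sketched in plain Python. This is not the LLMPromptTransformer API: the helper `annotate_with_prompt`, the dictionary layout, and the stand-in `echo_llm` callable are all hypothetical, chosen only to illustrate filling a prompt per object and storing the response as metadata.

```python
from typing import Callable, Dict, List


def annotate_with_prompt(
    items: List[Dict],
    prompt_template: str,
    llm: Callable[[str], str],
    meta_key: str,
) -> List[Dict]:
    """Fill the template for each item, call the model, store the reply as metadata."""
    for item in items:
        prompt = prompt_template.format(text=item["text"])
        item.setdefault("meta", {})[meta_key] = llm(prompt)
    return items


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real provider call; it only reports the prompt length."""
    return f"received {len(prompt)} characters"


# Toy utterance-level objects (hypothetical layout, not ConvoKit's Utterance class).
utterances = [
    {"id": "u1", "text": "Thanks for the help!"},
    {"id": "u2", "text": "That is simply wrong."},
]
annotate_with_prompt(utterances, "Is this utterance polite? {text}", echo_llm, "politeness")
```

Passing the model as a callable keeps the loop independent of any one provider, which loosely mirrors the multi-provider support described above.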

2. Summary of Conversation Dynamics (SCD)

We introduce the SCD Transformer, a transformer that generates structured summaries of conversational dynamics using the LLM Prompt Transformer. For more details, see the SCD page.

3. Conversation Dynamics Similarity (ConDynS)

We present ConDynS, a similarity measure for comparing conversations with respect to their dynamics, as introduced in "A Similarity Measure for Comparing Conversational Dynamics". For more information, see the ConDynS page.

ConvoKit 3.5.0

15 Oct 19:19
5ec4530

We are excited to release ConvoKit version 3.5.0, which introduces the new TalkTimeSharing Transformer, a module for analyzing how talk-time evolves between speakers throughout a conversation—capturing both overall balance and moment-to-moment dynamics. The release includes a demo applying the method to different conversational datasets. Please see PR #276 for details!

ConvoKit Version 3.4.0 and 3.4.1

15 Aug 20:30
3718c27

We’re excited to announce ConvoKit 3.4.0, which includes:

  • New TransformerDecoderModel and TransformerEncoderModel classes, enabling the use of LLMs for forecasting tasks.
  • An expanded CGA-CMV corpus, now containing 19,578 conversations and 116,793 comments.

Version 3.4.1 addresses issues reported by @kakeith, including macOS installation problems caused by unsupported dependency packages and a download error for the FOMC Corpus. This update introduces:

  • A new optional installation target for LLM-related packages required by certain transformers (currently Forecaster, Redirection, Pivotal Moments, and Utterance Simulator). These can be installed via: pip install 'convokit[llm]'.
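
To see which packages an optional target such as `llm` pulls in, one option is to read the installed distribution's metadata with the standard library. This is a minimal sketch: both helper names are hypothetical, and it assumes the extra is declared with a standard `extra == "..."` environment marker.

```python
from importlib import metadata
from typing import List, Optional


def requirements_for_extra(requires: Optional[List[str]], extra: str) -> List[str]:
    """Return requirement specifiers whose environment marker names `extra`."""
    wanted = f'extra=="{extra}"'
    selected = []
    for requirement in requires or []:
        spec, _, marker = requirement.partition(";")
        # Normalise quoting and spacing so extra == "llm" and extra=='llm' both match.
        normalised = marker.replace("'", '"').replace(" ", "")
        if wanted in normalised:
            selected.append(spec.strip())
    return selected


def installed_extra_requirements(dist_name: str, extra: str) -> Optional[List[str]]:
    """Requirements behind `extra` for an installed distribution, or None if it is absent."""
    try:
        return requirements_for_extra(metadata.requires(dist_name), extra)
    except metadata.PackageNotFoundError:
        return None
```

In an environment where ConvoKit is installed, `installed_extra_requirements("convokit", "llm")` should list the packages behind the `llm` target. The exact contents depend on the installed version and are not stated in these notes.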

We appreciate the efforts of all contributors in making this release, and thank @kakeith for raising the issue that helped us improve!

ConvoKit Version 3.3.0

04 Jul 08:58
bb2f75f

We are excited to release version 3.3.0, which updates the Supreme Court Corpus with merged and cleaned utterance data covering 1955–2023, incorporating newly added transcripts from 2019–2023. Extensive validation and manual checks were performed to ensure the dataset's integrity. For full details, see the pull request.

ConvoKit Version 3.2.0

02 Jun 03:20
f7c213c

We are excited to announce the release of ConvoKit 3.2.0! This version introduces a framework for identifying pivotal moments in conversations. We also release a demonstration of the framework via this notebook. For more information, please check the PR for the new features.

ConvoKit Version 3.1.0

31 Dec 02:43
7a7e9f6

We are excited to announce the release of ConvoKit 3.1.0! This version introduces a framework for measuring redirection in conversation flow, as described in this paper. We also release a demonstration of the framework on Supreme Court oral arguments via Google Colab. In addition to redirection, we provide a generalized transformer for annotating utterance-level likelihoods given a defined conversation context. For more information, see PR #250, which adds the new features.

ConvoKit Version 3.0.2

28 Dec 05:03
af8adcd

We are excited to release ConvoKit 3.0.2! This minor update resolves installation issues related to older versions of SciPy by updating the package dependency to require a more recent version. We found that Google Colab, which pre-loads packages at runtime, may still produce errors. These can be resolved by restarting the session and re-running the code blocks so that the correct package versions are imported. For more details, please refer to our Troubleshooting page and pull request #257.
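
A restart is needed because a module imported before an upgrade stays in memory at its old version. The rough check below compares the loaded version with the installed one. It assumes the module exposes `__version__` (SciPy does) and that the distribution name matches the one passed in. `needs_restart` is a hypothetical helper, not part of ConvoKit.

```python
import sys
from importlib import metadata
from typing import Optional


def needs_restart(module_name: str, dist_name: Optional[str] = None) -> bool:
    """True if `module_name` is already imported at a version other than the installed one."""
    module = sys.modules.get(module_name)
    if module is None:
        # Not imported yet, so the next import picks up the installed version.
        return False
    loaded = getattr(module, "__version__", None)
    try:
        installed = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        # No installed distribution to compare against.
        return False
    return loaded is not None and loaded != installed
```

Running `needs_restart("scipy")` in a notebook cell after installation would then indicate whether the session still holds a stale SciPy import.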

ConvoKit Version 3.0.1

20 Nov 06:10
a506040

We are excited to announce the release of ConvoKit 3.0.1, which focuses on bug fixes, adding new datasets, and dependency upgrades. Key updates include:

  • Fixed an issue with ConvoKit's download method that prevented datasets from being downloaded to the configured directory.
  • Fixed support for downloading non-corpus objects.
  • Updated the conversational forecasting transformer to make it more flexible.
  • Added five new datasets, with documentation available on our website and documentation site.
  • Addressed compatibility issues related to NumPy by building against NumPy 2.0+ and upgrading dependency packages accordingly.

We address some potential issues on our Troubleshooting page, especially those involving NumPy. If you encounter any issues, feel free to join our Discord community for more support, or submit an issue on GitHub. Thank you!

Note that we no longer support Python 3.8 (EOL) or 3.9 (not supported by NumPy 2.0.0+).
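
A script that depends on this release can fail fast on an unsupported interpreter. The sketch below assumes Python 3.10 is the oldest supported version. That follows from the dropped versions but is not stated as a minimum in these notes. `is_supported_python` is a hypothetical helper.

```python
import sys
from typing import Tuple

# Assumption: with 3.8 and 3.9 dropped, 3.10 is treated as the floor.
MINIMUM_PYTHON: Tuple[int, int] = (3, 10)


def is_supported_python(version: Tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    """True when `version` (major, minor, ...) is at or above the assumed minimum."""
    return tuple(version[:2]) >= MINIMUM_PYTHON


if not is_supported_python():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} "
        "is older than the assumed minimum of 3.10"
    )
```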

You can refer to the following pull requests for more details:

  • Fixing bugs:

    • [1] Fixing ConvoKit download method #225 #217
    • [2] New Forecaster Framework #217
  • New datasets:

    • [1] CANDOR corpus #201
    • [2] DeliData corpus #238
    • [3] FORA corpus #238
    • [4] NPR-2P corpus #238
    • [5] FOMC corpus #238
  • Dependency packages:

    • [1] Building ConvoKit to work with Numpy 2.0.0+ #229 #251 #247

Contributors:

  • Kaixiang Zhang (Sean)
  • Ethan Xia
  • Yash Chatha
  • Laerdon Yah-Sung Kim
  • Jonathan P. Chang