DSO Project

Rui Liu edited this page Feb 25, 2014 · 45 revisions

Initial Commit

This version is based on the prototype branch of helloqa. For helloqa, we made the following changes:

  • Fixed dependency errors
  • Fixed SQL errors
  • Fixed platform errors

For the DSO project, we converted the components developed in summer '10 into CSE phases. More specifically, we converted the following components:

These components extend the following abstract classes:

  • edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractAnswerTypeExtractor
  • edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractICEventExtractor
  • edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractKeytermExtractor
  • edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractPassageRetrieval
  • edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractInformationExtractor
  • edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractAnswerGenerator
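
The extension pattern can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: KeytermExtractorSketch is a stand-in written for this example, not the real AbstractKeytermExtractor, whose methods and signatures are not documented on this page. The stopword filter is a toy and does not reproduce the framework's actual keyterm logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a framework base class such as
// AbstractKeytermExtractor. The real class's API is NOT shown here.
abstract class KeytermExtractorSketch {
  // Template method: the base class owns the phase workflow
  // (input validation here; JCas handling in the real framework).
  final List<String> process(String question) {
    if (question == null || question.trim().isEmpty()) {
      return new ArrayList<>();
    }
    return extractKeyterms(question.trim());
  }

  // Hook that each concrete component implements.
  protected abstract List<String> extractKeyterms(String question);
}

// Toy component: keeps every token that is not a stopword.
class StopwordKeytermExtractor extends KeytermExtractorSketch {
  private static final Set<String> STOPWORDS =
      new HashSet<>(Arrays.asList("who", "is", "the", "of"));

  @Override
  protected List<String> extractKeyterms(String question) {
    List<String> keyterms = new ArrayList<>();
    for (String token : question.split("\\s+")) {
      // Strip punctuation such as the trailing question mark.
      String cleaned = token.replaceAll("[^A-Za-z0-9]", "");
      if (!cleaned.isEmpty() && !STOPWORDS.contains(cleaned.toLowerCase())) {
        keyterms.add(cleaned);
      }
    }
    return keyterms;
  }
}

class KeytermSketchDemo {
  public static void main(String[] args) {
    KeytermExtractorSketch extractor = new StopwordKeytermExtractor();
    System.out.println(
        extractor.process("Who is the mastermind of World Trade Center bombing?"));
  }
}
```

The point of the sketch is the split of responsibilities: the abstract class fixes the workflow of a phase, and each component only supplies the extraction step.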

The JCas objects are handled by the following classes:

  • edu.cmu.lti.oaqa.openqa.dso.framework.jcas.AnswerTypeJCasManipulator
  • edu.cmu.lti.oaqa.openqa.dso.framework.jcas.ICEventJCasManipulator
  • edu.cmu.lti.oaqa.openqa.dso.framework.jcas.KeytermJCasManipulator
  • edu.cmu.lti.oaqa.openqa.dso.framework.jcas.DocumentJCasManipulator
  • edu.cmu.lti.oaqa.openqa.dso.framework.jcas.AnsJCasManipulator

Updated Aug-19-2013

To run the system, we have to make sure that the following files exist under helloqa:

Updated Aug-20-2013

Fixed all the dependency issues by converting jar files to Maven dependencies.

The following steps are required to run the system:

  • Get an account to access the Nexus server: http://mu.lti.cs.cmu.edu:8081/nexus/index.html#welcome
    • Ask Zi, Avner, or Rui to create one.
  • Configure the internal Maven repository. These steps are for Linux; other platforms (e.g. Windows) may have problems:
    • You can find your Maven files here:
      • cd ~/.m2
    • If no .m2 folder exists:
      • mkdir -p ~/.m2
    • Create a file settings.xml under .m2
      • Ask Zi, Avner, or Rui to email you the file.
    • Open Eclipse, go to ‘Window->Preferences->Maven->User Settings’, click the ‘Update Settings’ button, then click ‘Apply’.
    • It should be working now.
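
If the Maven build still cannot resolve dependencies, a quick check of the directory steps above may help. The following is a minimal sketch, assuming the default per-user location ~/.m2/settings.xml. The class and method names are made up for this example, and it does not validate the contents of the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the DSO project.
class MavenSettingsCheck {
  // Expected location of the per-user Maven settings file.
  static Path settingsPath(Path homeDir) {
    return homeDir.resolve(".m2").resolve("settings.xml");
  }

  // Creates the .m2 directory if it is missing (like `mkdir -p`) and
  // reports whether settings.xml is already in place.
  static boolean ensureM2AndCheckSettings(Path homeDir) throws IOException {
    Files.createDirectories(homeDir.resolve(".m2"));
    return Files.isRegularFile(settingsPath(homeDir));
  }

  public static void main(String[] args) throws IOException {
    Path home = Paths.get(System.getProperty("user.home"));
    if (ensureM2AndCheckSettings(home)) {
      System.out.println("Found " + settingsPath(home));
    } else {
      System.out.println("Missing " + settingsPath(home)
          + " (ask for the file as described above)");
    }
  }
}
```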

Updated Aug-29-2013

Evaluation is ready! Baseline System is ready!

Initial test results for the DSO framework:

  • Question: Who is the mastermind of World Trade Center bombing?
  • Answer Key: Ramzi Yousef
  • Phase 1: AnswerTypeExtractor
    • Input: Who is the mastermind of World Trade Center bombing?
    • Output: NEproperName->NEperson->NEterrorist
  • Phase 2: KeytermExtractor
    • Input: Who is the mastermind of World Trade Center bombing?
    • Output: [mastermind, World Trade Center, bombing]
  • Phase 3: ICEventExtractor
    • Input: Who is the mastermind of World Trade Center bombing?
    • Output: combined-27409
  • Phase 4: PassageRetrieval
    • Input: Question, Keyterms and AnswerType
    • Output: doc size = 152
  • Phase 5: InformationExtractor
    • Input: Question, Keyterms, AnswerType and Retrieved Docs
    • Output: NER size = 307, [Ramzi Yousef, Jemaah Islamiyah, Abu Sayyaf, ... ]
  • Phase 6: AnswerGenerator
    • Input: Answer Candidates
    • Output: Ranked Answers: [Ramzi Yousef and co-conspirators, Ramzi Yousef, Khalid Sheikh Mohammed, ... ]

Initial evaluation result for a single question:

  • Who is the mastermind of World Trade Center bombing?
    • Reciprocal rank: 0.5
    • Accuracy: 0.0
    • Binary recall: 1.0
    • EVALUATION REPORT
      • Experiment: 8e9c8b9d-cd2e-43b2-ad0e-e84b741c4b48:1
      • Evaluator: PassageMAPMeasuresEvaluator
      • Configuration: 1|AnswerTypeExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 2|KeytermExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 3|ICEventExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 4|PassageRetrieval[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 5|InformationExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 6|AnswerGenerator[persistence-provider:inherit: ecd.default-log-persistence-provider ]
      • DocumentMAP: 0.5000
      • PassageMAP: 0.0000
      • AspectMAP: 1.0000
      • Count: 1
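
The three per-question measures can be reproduced by hand from the ranked answers reported for Phase 6. The sketch below uses the standard textbook definitions (reciprocal rank of the first correct answer, top-1 accuracy, and whether any correct answer appears at all) with case-insensitive exact string matching. Both the definitions and the matching rule are assumptions; the evaluator's actual implementation is not shown on this page. Only the three answers listed above are used, since the rest of the ranking is elided.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper written for this example; not the framework's evaluator.
class QaMetricsSketch {
  // 1/rank of the first answer matching the key, or 0 if none matches.
  static double reciprocalRank(List<String> ranked, String key) {
    for (int i = 0; i < ranked.size(); i++) {
      if (ranked.get(i).equalsIgnoreCase(key)) {
        return 1.0 / (i + 1);
      }
    }
    return 0.0;
  }

  // 1 if the top-ranked answer matches the key, else 0.
  static double accuracy(List<String> ranked, String key) {
    return !ranked.isEmpty() && ranked.get(0).equalsIgnoreCase(key) ? 1.0 : 0.0;
  }

  // 1 if the key appears anywhere in the ranking, else 0.
  static double binaryRecall(List<String> ranked, String key) {
    return reciprocalRank(ranked, key) > 0.0 ? 1.0 : 0.0;
  }

  public static void main(String[] args) {
    List<String> ranked = Arrays.asList(
        "Ramzi Yousef and co-conspirators",
        "Ramzi Yousef",
        "Khalid Sheikh Mohammed");
    String key = "Ramzi Yousef";
    System.out.println("Reciprocal rank: " + reciprocalRank(ranked, key)); // 0.5
    System.out.println("Accuracy: " + accuracy(ranked, key));              // 0.0
    System.out.println("Binary recall: " + binaryRecall(ranked, key));     // 1.0
  }
}
```

Under these assumptions the correct answer sits at rank 2, which gives a reciprocal rank of 1/2, a top-1 accuracy of 0, and a binary recall of 1, matching the reported values.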