Release new version of system to master branch (by May 4)

Closed May 7, 2014. 100% complete.

TODOs for the next code push

Must fix before pushing

Documentation

  • ID convention: require developers to create an "id bigint" column in variable tables, but not to use the column themselves
  • inference rules convention: check and update
  • extractors (Done in http://deepdive.stanford.edu/doc/extractors.html; needs review)
  • Newly supported configuration options
    • skip_learning
    • weight_table
    • relearn_from
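
The ID convention above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in database, and the table name `has_spouse` and its columns are hypothetical; the only part taken from the list above is that the variable table declares an `id bigint` column and developer code never writes to it.

```python
import sqlite3

# Stand-in database for illustration only; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE has_spouse ("
    " person1_id TEXT,"
    " person2_id TEXT,"
    " is_true BOOLEAN,"
    " id BIGINT"  # declared per the convention, never written by developer code
    ")"
)

# Developer code lists its own columns explicitly and leaves "id" out.
conn.execute(
    "INSERT INTO has_spouse (person1_id, person2_id, is_true) VALUES (?, ?, ?)",
    ("p1", "p2", True),
)

# The column exists in the schema, but stays NULL in rows written by the developer.
columns = [row[1] for row in conn.execute("PRAGMA table_info(has_spouse)")]
id_values = [row[0] for row in conn.execute("SELECT id FROM has_spouse")]
```

Naming the columns in every INSERT keeps the reserved column untouched even if the table definition later changes its column order.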

Known issues in code

  • The default extractor (udf_extractor) still assigns "id" in its output JSON.
  • Greenplum parallel load / unload is not implemented yet in tsv_extractor
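
Until the udf_extractor issue is fixed, one possible workaround is to drop the "id" key from each output record before it is loaded. The sketch below assumes the extractor emits one JSON object per line; `strip_id` is a hypothetical helper, not part of the system.

```python
import json


def strip_id(json_line: str) -> str:
    """Return the JSON record on this line with any top-level "id" key removed."""
    record = json.loads(json_line)
    # pop with a default so records that have no "id" pass through unchanged
    record.pop("id", None)
    return json.dumps(record, sort_keys=True)
```

For example, `strip_id('{"id": 7, "word": "a"}')` returns `'{"word": "a"}'`.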

Test, document, and code-review the following components:

  • New extractor path 1: plpy_extractor
  • New extractor path 2: tsv_extractor
  • Extractor path 3: sql_extractor
  • Extractor path 4: cmd_extractor
  • Grounding

Test to make sure all examples work

  • Pay attention to the OCR example, which has 2 variable tables
  • spouse_example contains 3 implementations with different extractor frameworks.

Optional

  • Go through the whole website

  • More tests for plpy_extractor

    • Test extreme cases for input queries
    • Test extreme cases for UDFs
  • Write a debugger for plpy_extractor

  • Add new unit tests

    • for all pipeline configurations
    • for all extractors
    • for the before and after scripts of all extractors
    • for extreme values of input_batch_size in tsv_extractor
    • for tsv_extractor with output_batch_size disabled
