Meeting Minutes Mar 20 18

Gregory Gould edited this page Apr 5, 2018 · 3 revisions

2018-03-20

Location   Time            Duration
CSC-262    12:00 - 14:00   2 hrs

Team Meeting

  • PDF Viewing

    • Austin is implementing this.
  • Updates on PDF Scraping

    • Cecilia and Julienne are going to step in and help Greg with PDF scraping. The JSON format was too deeply nested and needed tailoring, and a different method is needed for injecting non-review documents.
  • PDF Scraping

    • Switched from PDFMiner to Apache Tika, which was able to preserve the layout.
  • Discussion on serving PDFs

    • Julienne and Cecilia continued discussing how PDFs will be served in the system and concluded that more research is needed on the available options.
  • NLP

    • NLP will be applied to queries only, using a separate tool. No training data for governmental queries is available that would be useful here. The engine is also a PDF server, not a personal assistant.
  • Client emailed to arrange meeting to show progress

  • Focusing on searching with real data, implementing Advanced Search and Filtering capabilities, connecting visualization with data, and serving PDFs/attachments
