Meeting Minutes Jan 29 18

2018-01-29

Location	Time	Duration
ATH-452	3:30 - 4:00	30min.

Discussion with Eleni

Expectation: The project should deliver more than a simple search that returns results
Clarification:
- 'People' is a very nice and understandable quantity
- However, a search for 'Discussions with USRI evaluations' should also detect 'teaching evaluations', 'teaching evaluations reform', 'quality of instruction reform'
WANT: n-grams (phrases); names from spreadsheets
- There are other keywords hidden in the text (n-grams)
- We have to analyze the source text
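
A minimal sketch of what extracting n-grams from source text could look like, using only the Python standard library. The function names `ngrams` and `count_phrases` and the sample sentence are illustrative assumptions, not part of the project.

```python
import re
from collections import Counter

def ngrams(text, n):
    """Return all word n-grams in the text as lowercase strings."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def count_phrases(text, max_n=3):
    """Count every 1-gram up to max_n-gram in the text."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(text, n))
    return counts

# Hypothetical sample text, for illustration only.
sample = "Discussion of teaching evaluations. The teaching evaluations reform was tabled."
counts = count_phrases(sample)
print(counts["teaching evaluations"])
```

This naive version lets n-grams run across sentence boundaries; a real pipeline would probably split sentences first.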

Diagram

Minutes are ‘probably’ in the spreadsheets
Agenda and minutes are 1:1; minutes and attachments are 1:n
‘Key phrases’ come from item, from attachments, from minutes
For each item => agenda(i), minutes(i-1), attachments(i)
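
One possible reading of the relationships above, sketched as a small data model. It assumes that the package for meeting i pairs agenda(i) and attachments(i) with the minutes of the previous meeting, minutes(i-1). The class `Meeting` and the function `package_for` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Meeting:
    agenda: str                     # agenda text for this meeting
    minutes: str                    # minutes of this meeting (1:1 with the agenda)
    attachments: List[str] = field(default_factory=list)  # 1:n

def package_for(meetings: List[Meeting], i: int) -> Dict[str, object]:
    """Collect agenda(i), minutes(i-1), attachments(i) for meeting index i."""
    # Assumption: the first meeting has no previous minutes.
    previous_minutes = meetings[i - 1].minutes if i > 0 else None
    return {
        "agenda": meetings[i].agenda,
        "minutes": previous_minutes,
        "attachments": meetings[i].attachments,
    }

# Hypothetical data, for illustration only.
meetings = [
    Meeting("A0", "M0", ["x.pdf"]),
    Meeting("A1", "M1", ["y.pdf", "z.pdf"]),
]
package = package_for(meetings, 1)
```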
Look for OUTLINE OF ISSUE --> (next) OUTLINE OF ISSUE
- The text between two consecutive OUTLINE OF ISSUE markers is the 'PDF blob' we return as a search result
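
The marker-based split could be sketched as follows, assuming the PDF text has already been extracted to a plain string. The function name `split_blobs` is hypothetical. Treating the text after the last marker as a final blob is also an assumption.

```python
def split_blobs(text, marker="OUTLINE OF ISSUE"):
    """Return the text blobs that follow each marker, up to the next marker."""
    pieces = text.split(marker)
    # pieces[0] is whatever precedes the first marker, so it is skipped.
    # Assumption: the text after the last marker counts as a blob too.
    return [piece.strip() for piece in pieces[1:] if piece.strip()]

# Hypothetical extracted text, for illustration only.
extracted = "Header\nOUTLINE OF ISSUE\nItem A text\nOUTLINE OF ISSUE\nItem B text"
blobs = split_blobs(extracted)
```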
Take attachments and cross-reference with items in agenda
Example: A question gets submitted a week before the meeting => the secretary does something with it, etc.; something developed outside gets submitted to the president (attachments come from all different places)
e.g., Stroulia + USRIs + APC => find all USRIs in APC where Stroulia was present
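
A rough sketch of that query over already-parsed items. The record layout (`committee`, `attendees`, `text`), the function `find_items`, and the sample records (including the attendee "Smith") are all hypothetical.

```python
def find_items(items, person, phrase, committee):
    """Return items from one committee that mention a phrase with a person present."""
    return [
        item for item in items
        if item["committee"] == committee
        and person in item["attendees"]
        and phrase.lower() in item["text"].lower()
    ]

# Hypothetical records, for illustration only.
items = [
    {"committee": "APC", "attendees": {"Stroulia", "Smith"},
     "text": "Discussion of USRIs and response rates"},
    {"committee": "APC", "attendees": {"Smith"},
     "text": "USRIs pilot update"},
    {"committee": "GFC", "attendees": {"Stroulia"},
     "text": "USRIs reform motion"},
]
matches = find_items(items, "Stroulia", "USRIs", "APC")
```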
Either:
- We can find them as words: NLTK (NLP)
- Entity recognition, extract entities trained to recognize more interesting key phrases: IBM Watson, Stanford
Searching PDFs: parsing PDFs is risky
Elasticsearch -> free-text search; parses and indexes everything
NLTK vs. Elasticsearch -> algorithms for counting words vs. indexing
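
To illustrate the difference between counting words and indexing, here is a toy comparison using only the standard library. It is not how NLTK or Elasticsearch are actually called. It only shows the idea of a frequency count versus an inverted index that maps each word to the documents containing it. The document IDs and texts are hypothetical.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical documents, for illustration only.
docs = {
    "apc-2017-12": "USRI results were discussed at length",
    "gfc-2018-01": "Budget discussed; USRI reform deferred",
}

# Counting: how often does each word occur overall?
word_counts = Counter(w for text in docs.values() for w in tokenize(text))

# Indexing: which documents contain each word?
inverted_index = defaultdict(set)
for doc_id, text in docs.items():
    for word in tokenize(text):
        inverted_index[word].add(doc_id)

def search(word):
    """Return the sorted IDs of documents containing the word."""
    return sorted(inverted_index.get(word.lower(), set()))
```

The count answers "how often", while the index answers "where", which is what a search result needs.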

Suggestions

Start with Elasticsearch (find the boxes with text); once you have the boxes, find the attachments (for each piece of text found, we need to know where it comes from)
Visualization: ‘I found this 10 times; found 6 in December, found 4 times in January.’
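
The per-month breakdown in that example could be computed along these lines, assuming each search hit carries the date of the meeting it came from. The function `hits_by_month` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

def hits_by_month(hit_dates):
    """Group hit dates by 'YYYY-MM' and return the counts in chronological order."""
    counts = Counter(d.strftime("%Y-%m") for d in hit_dates)
    return dict(sorted(counts.items()))

# Hypothetical hits: 6 in December 2017 and 4 in January 2018.
hit_dates = [date(2017, 12, 4)] * 6 + [date(2018, 1, 15)] * 4
summary = hits_by_month(hit_dates)
total = sum(summary.values())
```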
HIGHLIGHT BOXES WITH SOME REFERENCE OR PLACE IN PICTURE
Every item will have a 'blob' (all text associated with item from attachments => will have description)
Spend more time on Visualization and UI
Filters! i.e., show everything with a given question attached to it: what happened in APC and GFC, where it got elaborated, where it got slowed down
AGENDA, MINUTES, ATTACHMENTS => technical user story “As a PDF parser…. [find key phrases], [calculate distinctiveness]”
Keyphrase = something that is sufficiently frequent but not evenly distributed across all documents (covered by NLTK)
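
One way to make "frequent but not evenly distributed" concrete is a TF-IDF-style score. This technique is swapped in here for illustration and is not confirmed as the method the notes attribute to NLTK. The function name `distinctiveness` and the sample tokens are assumptions.

```python
import math
from collections import Counter

def distinctiveness(docs):
    """Score each term as total frequency times log(N / document frequency).

    docs is a list of token lists. A term found in every document scores 0.
    """
    n_docs = len(docs)
    total = Counter()
    doc_freq = Counter()
    for tokens in docs:
        total.update(tokens)          # how often the term occurs overall
        doc_freq.update(set(tokens))  # how many documents contain it
    return {t: total[t] * math.log(n_docs / doc_freq[t]) for t in total}

# Hypothetical tokenized documents, for illustration only.
docs = [
    ["motion", "usri", "usri"],
    ["motion", "budget"],
    ["motion", "usri"],
]
scores = distinctiveness(docs)
```

Here "motion" appears in every document and scores zero, while "usri" is frequent but concentrated in a subset and scores highest.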
NLTK -> lots of resources
Somebody has to develop data ingestion, somebody else does elastic search, somebody else does user interface, etc.
“PROBLEMS”:
- PDFs
To consider:
- Find: “Stroulia” said something
- Next: USRI scenario (get teaching quality, etc.)

The Next Level

Knowledge Graphs
Get keyphrases, then at some point do further analysis to figure out that ‘USRIs’ is a type of teaching evaluation.
‘University’ is about ‘teaching’
Extract meaningful phrases
In the knowledge graph, linking ‘teaching evaluation’ and ‘USRIs’ lets the system figure out which teaching evaluations a document is talking about
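
As a very small sketch of the idea, the "is a type of" relationship can be stored as edges and followed to widen a search term. The dictionary `is_a` and the function `broader_terms` are hypothetical. Only the first edge comes from the notes above, and the second edge is an assumed example.

```python
def broader_terms(term, is_a):
    """Follow 'is a type of' edges upward and return the broader terms in order."""
    found = []
    current = term
    # Stop at the top of the chain, or if an edge would revisit a term (cycle guard).
    while current in is_a and is_a[current] not in found:
        current = is_a[current]
        found.append(current)
    return found

is_a = {
    "USRIs": "teaching evaluation",     # from the notes
    "teaching evaluation": "teaching",  # assumed edge, for illustration only
}
expanded = ["USRIs"] + broader_terms("USRIs", is_a)
```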

Q & A

Where are these PDFs?
- PDFs are available on the web and we have to scrape them (suggested: wget)
Can we get access to SharePoint?
- No, SharePoint access is not available
When the system is in use, will it still pull from the website?
- Our system is a demonstration prototype; we don’t care where the PDFs come from, we just feed them in
