Meeting Minutes

AustinGrey edited this page Jan 26, 2018 · 27 revisions

Group Meetings

2018-01-17

Location Time Duration
CSC-262 12:00 - 14:00 2hr

General Discussion

  • Method of communication: Discord
  • Designated extra meeting time: Thursdays, 15:20 - 17:00
  • Outlined plans for Sprint 1

Questions for Marion, Ann, Meg, and Barbosa

  • Are we altering the existing website or making a new one? Tools for the website?
  • Is this 4 separate tools? Do you envision a set of back-end/front-end?
  • Knowledge graphs: Accessible to public/private only? Different types for different users?
  • What will data dump look like? (How regular is the data, when will it be available, request examples)
  • Will we be working on their Cybera account?
  • What languages do current tools use?

Brainstorming

Data Ingestion

  • NoSQL DB? (Firebase, MongoDB, CouchDB, SOLR)

Analysis

  • NLP (SOLR?, other libraries)

Search

  • What type of queries? (More examples)

Visualization

  • What graph does the client want?

Sprint 1 Planning

  • Requirements (use cases, user stories, prepare a template, prioritize them, number them)
  • Navigation/UI (Nav diagram, storyboard/screens, amount of detail needed?)
  • High Level Design (Draw.io/Lucidchart)
  • Release Planning (Create page on wiki - professional so that client can view, only 4 sprints)
  • Project Overview (Glossary, list of competitors?)
  • Meetings (Minutes, Times)

2018-01-19

Location Time Duration
SAB-307 13:00 - 14:00 1hr

General discussion notes

  • Data is not yet available; it should be by the end of the day
  • The governing board is made up of several committees, each with members from around the university. It is business focused.
  • The academic board handles more academic matters and has members from each faculty. It handles the course calendar, code of conduct, and staffing of classes. It is student focused.
  • We will be handling data from both the Board of Governors and the General Faculties Council
  • The nominating, replenishing, and awarding committees are NOT shared with us; they are confidential. Every other committee is public.
  • Each meeting has the following process
    • Agenda
    • Meeting documentation
    • Eventually goes into a corp record
    • The members share information with each other
    • Then presentations
    • Then action taking
    • Then decisions/actions posted
  • Important docs:
    • Agenda listings
    • Approved minutes
    • Meeting materials
  • Members try to find the meeting materials to prepare themselves
  • Community members might be interested in a specific item of the agenda, which has associated meeting materials.
  • “As a community member I want to see all references to an agenda item”
  • Internally they use SharePoint and Sitecore
  • These support business processes and sharing of data. The outputs are what gets put on the website
  • Sitecore handles the website
  • Right now Sitecore search isn’t intuitive or robust. You need to find a specific area and then search from there
  • SharePoint has too much non-essential data.
  • They can currently search on agenda titles
  • But they WANT to be able to search on topics
  • Not all the topics will be called the same thing
  • Goal: Make a searchable PDF database.
  • On sitecore2 platform right now, working on an update.
  • Two user types: members, public
  • The goal: connect users to PDFs that match their queries
  • Queries:
    • I want to know what is happening with the Peter Lougheed College
    • What committees talked about it?
    • What aspects did they talk about?
    • How many committees does Eleni sit on?
    • How many decisions were made on this topic?
  • Start with committee members? Expand later to the public?
  • Ann: “We were looking up something that had the word ethics in it, we were all searching for this thing as long as it had ethics, but the word was ethical, and then too many results came up in all these different sections and we had to divide and conquer to try and find this document”
  • Must be portable, mobile friendly, web-based
  • “What happened to do with ethical in 2017?”
  • “Which committees touched the ethical topic”
  • “Which decisions were about ethical topics”
  • “If I wanted information on how the dean of science was appointed, what committees talked about that appointment?”
  • “Why did they change my degree name? Who decided this? Who voted on it?”
  • “What’s being talked about related to tuition right now?”
  • THEY WANT A QUERY LANGUAGE, something that can be very specific, but they need natural language for when they aren’t exactly sure what they are looking for
  • “Maybe it’s better to just chuck the PDFs right back into sharepoint?”
  • “We need your expert analysis”
  • “If it could be a search function”
  • “A webtool, some predefined functions?”
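
Ann's "ethics" vs. "ethical" anecdote above is a word-form matching problem. The following is only a minimal sketch of one way to approach it, assuming a crude suffix-stripping step in front of a keyword index. The suffix list, document names, and document text are all made up for illustration; a real system would more likely rely on the stemming built into a search engine.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Hypothetical suffix list, checked in order (longer suffixes first).
SUFFIXES = ("ical", "ics", "ic", "al", "es", "s")


def crude_stem(word: str) -> str:
    """Lower-case a word and strip one common suffix (illustration only)."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each stem to the set of document names that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, text in docs.items():
        for token in re.findall(r"[A-Za-z]+", text):
            index[crude_stem(token)].add(name)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the documents that contain every stemmed query term."""
    stems = [crude_stem(t) for t in re.findall(r"[A-Za-z]+", query)]
    if not stems:
        return set()
    results = set(index.get(stems[0], set()))
    for stem in stems[1:]:
        results &= index.get(stem, set())
    return results


# Made-up document names and text, standing in for extracted PDF content.
DOCS = {
    "minutes_a.pdf": "Discussion of the ethical review policy",
    "agenda_b.pdf": "Research ethics committee update",
    "minutes_c.pdf": "Tuition proposal for next year",
}
INDEX = build_index(DOCS)
print(sorted(search(INDEX, "ethics")))
```

With this sketch, searching for "ethics" and searching for "ethical" return the same two documents, because both words reduce to the same stem before lookup.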

2018-01-26

Location Time Duration
CSC-262 12:00 - 14:00 2hr

General Discussion

  • Questions to answer Eleni's emails
  • Do you have a good idea of what you will be delivering?
    • A search page with a well-defined search syntax and a very minimalist design
    • It should have some predefined questions with associated pre-made searches that people will find interesting
    • Visualizations: we are not sure yet
    • We take Excel files, parse them into a database, and deliver database queries using the SPARQL language

Diego

  • We need a GitHub repo
  • Need a record of all meetings
  • There is a research component to this project
    • You need to be able to take text and extract entities from it to place in the database as well
    • E.g. "Bill Gates founded Microsoft": extract Bill Gates, Microsoft, and the founding relationship
  • How do you think the system will look?
    • We gave our definition
    • It's not very clear: WE NEED TO SCHEDULE A MEETING WITH ELENI
  • What technologies will you be using?
    • Denilson is going to tell you which libraries to use to extract the entities
  • Repo
    • Need use cases (tentative assignment: Julienne and Chris)
    • Mockups and navigation diagram (tentative assignment: Austin)
    • 2 UML diagrams (component, class), plus the high-level diagram (tentative assignment: Cecilia)
      • The diagrams need more descriptions and notes than normal so Diego understands
    • A Gantt chart, with people assigned to each task
    • Glossary, list of similar products, description (tentative assignment: Vuk)
    • MUST BE USING GitHub issues
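
To make the "Bill Gates founded Microsoft" example from the research-component note concrete, here is a toy sketch of turning a sentence into a (subject, relation, object) triple. It assumes a hand-written regular expression and a hypothetical fixed verb list, which is far simpler than real entity extraction. The libraries Denilson recommends are not known yet, so none are used here.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical relation verbs; a real extractor would not rely on a fixed list.
RELATION_VERBS = ("founded", "chairs", "approved")

# A "name" here is one or more capitalized words, e.g. "Bill Gates".
NAME = r"[A-Z]\w*(?:\s+[A-Z]\w*)*"
PATTERN = re.compile(
    r"(" + NAME + r")\s+(" + "|".join(RELATION_VERBS) + r")\s+(" + NAME + r")"
)


def extract_triples(text: str) -> List[Triple]:
    """Return the (subject, relation, object) triples matched by PATTERN."""
    return [(s, verb, o) for s, verb, o in PATTERN.findall(text)]


print(extract_triples("Bill Gates founded Microsoft"))
```

The pattern only handles sentences of the form "Name verb Name"; anything more varied is where the NLP research component would come in.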

Eleni

  • Sharepoint data is transformed into PDF

  • There is extra information attached onto the PDFs after they are generated

  • Take note of the UML diagram that was sent; it describes the Excel dump we were sent

  • First step: bring up PDFs that mention a phrase/keyword

  • Second step: bring up only the relevant information from each PDF, what it is describing, and what other sections it links to, some sort of visualization

  • Third step: take text, analyze its syntactic structure, and auto-label everything semantically (nouns, relations: e.g. Stroulia said "blah"). The job is to take the text from the PDFs, take all the people we have in the knowledge graph from the spreadsheets, and then extract knowledge from the PDFs.

  • We could potentially solve this basically with SOLR, find the appropriate PDFs to scan

  • SPARQL is used to query triples, e.g. ("Eleni", is, "Human"), which store relationships

  • We have 2 options: we either go with a relational database, or we go SPARQL style

  • SPARQL + knowledge graphs
    • We need to get information from the Excel files and build knowledge graphs by defining all the data as triples
    • Then use NLP to extract more information and add more triples

  • Relational
    • Take spreadsheets -> SQL
    • Use SOLR to index text, get information that matches the diagram of entities
    • Then do NLP to get more information and add more data

  • Recommended d3.js, which can implement many visualizations
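
The SPARQL-style option discussed above (spreadsheet data defined as triples, then queried by pattern) can be sketched in plain Python. This is only an illustration under stated assumptions: the column names `member` and `committee`, the predicate names, and all row values are made up, and the `match` helper is a stand-in for a real triple store and SPARQL engine.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Made-up rows standing in for the Excel dump; column names are assumptions.
ROWS = [
    {"member": "Member A", "committee": "Committee X"},
    {"member": "Member A", "committee": "Committee Y"},
    {"member": "Member B", "committee": "Committee X"},
]


def rows_to_triples(rows: Iterable[Dict[str, str]]) -> Set[Triple]:
    """Define every spreadsheet row as (subject, predicate, object) triples."""
    triples: Set[Triple] = set()
    for row in rows:
        triples.add((row["member"], "is", "Person"))
        triples.add((row["committee"], "is", "Committee"))
        triples.add((row["member"], "sits_on", row["committee"]))
    return triples


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


GRAPH = rows_to_triples(ROWS)
# "How many committees does Member A sit on?"
print(len(match(GRAPH, s="Member A", p="sits_on")))
```

Triples extracted later from PDF text by NLP could be added to the same set, which is the main appeal of this option over a fixed relational schema.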
