Sprint planning and enhancements - ticket-ify this stuff and stick it in sprints #60

nkmeyers · 2020-05-26T18:09:18Z

@ericleasemorgan writes: I am not ignoring y'all. Interesting discussions, and I encourage them to continue.

Concurrently, we need to: 1) get the whole thing running, and 2) then do enhancements. When it comes to Item 1, we have to:

1. harvest/cache the data set (done)
- download CORD-19 datasets as well as the corresponding metadata file; select subset of CORD-19, save the metadata in the database, transform the JSON into plain text, and save the plain text to the file system #13
2. stuff the result into a database (done)
- store extracted features into the database system #9
- create a database node for enhanced Distant Reader backend, and initialize the database #12
3. enhance the database with additional content (all but done)
- what tickets are related to this? and does any other milestone or issue completion depend on it?
4. index the database (all but done)
- create an indexer node, and initialize an index #14
- loop through the database to create a full text index of the selected subset #15
- loop through the database to create a full text index of all of CORD-19 #22
- re-create the Solr index #59
5. make it easy for Team CORD to create study carrels (half done)
- what tickets are related to this? and does any other milestone or issue completion depend on it?
6. make many carrels (barely started)
- what tickets are related to this? and does any other milestone or issue completion depend on it?
7. create a Web presence (almost done)
- create a node for serving HTTP #16
- create a Web interface for searching the results #17
- any others?
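
For anyone picking up the harvest step, here is a minimal sketch of what "transform the JSON into plain text" might look like. It is not the Reader's actual code. The function name `cord_json_to_text` is hypothetical, and the field names (`metadata.title`, `abstract`, `body_text`, each paragraph carrying a `text` key) are an assumption about the CORD-19 JSON layout that should be checked against the files actually harvested.

```python
def cord_json_to_text(record: dict) -> str:
    """Flatten one CORD-19-style JSON record into plain text.

    Assumed layout (verify against the real files):
      record["metadata"]["title"] -> str
      record["abstract"]          -> list of {"text": str, ...}
      record["body_text"]         -> list of {"text": str, ...}
    """
    parts = []

    # Title first, if the record has one.
    title = record.get("metadata", {}).get("title", "")
    if title:
        parts.append(title)

    # Then every non-empty paragraph of the abstract and the body, in order.
    for field in ("abstract", "body_text"):
        for paragraph in record.get(field, []):
            text = paragraph.get("text", "").strip()
            if text:
                parts.append(text)

    # Blank line between paragraphs keeps the output readable.
    return "\n\n".join(parts)
```

Missing fields are skipped rather than raising, on the assumption that some records lack an abstract or a title.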

Stuff below here needs to be ticket-ified or aligned with tickets.
Once we get that far, which I anticipate will be by next Friday (May 29?), we can go for enhancements, and there are many possibilities:

add long titles to list of carrels
allow people other than Team CORD to create study carrels
create a study carrel out of the whole of CORD, which requires scalability
create better stop word list
enable the whole "library" to be re-created
enhance author names with corresponding ORCIDs
enhance Web presence with additional logos and attributions
extract additional grammars
figure out a way to dynamically create stop word list
generate additional measures of the documents
hyperlink bibliographic items to full text and other things
illustrate relationships using a network diagram
improve topic modeling
index study carrels
make everything FAIR
plot results on a map
plot results on a time line
refine entity output
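
As one possible starting point for the "dynamically create stop word list" item, here is a rough sketch that treats any word appearing in most documents of a carrel as a stop word. This is only one heuristic (document frequency), not something the team has settled on. The function name `dynamic_stop_words` and the `min_doc_fraction` cutoff of 0.8 are hypothetical choices for illustration.

```python
import re
from collections import Counter


def dynamic_stop_words(documents, min_doc_fraction=0.8):
    """Return words that occur in at least min_doc_fraction of the documents.

    documents: list of plain-text strings (for example, one per carrel item).
    min_doc_fraction: hypothetical cutoff; tune it per corpus.
    """
    doc_freq = Counter()
    for text in documents:
        # Count each word once per document (document frequency, not term frequency).
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))

    threshold = min_doc_fraction * len(documents)
    return {word for word, count in doc_freq.items() if count >= threshold}
```

For a corpus like CORD this would likely pick up domain words such as "virus" alongside the usual function words, which may or may not be what is wanted for topic modeling.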

As we enhance, we will repeatedly go back to Step #6 and re-build study carrels over and over, thus the carrels will be in a state of "continuous improvement".†

The whole thing is like playing guitar. First you need to learn how to hold it. Then you need to learn how to tune it. Then you need to learn a few chords. After that you need to learn how to "keep time". Once you get that far, you can concentrate on bending notes, finger picking, playing syncopation, experimenting with alternative tunings, moving the chords up and down the fret board, improvising, playing in various styles, performing, recording, etc.

We are getting there. I assure you. Please continue to discuss all of these things, and once we get the Reader running, we will prioritize enhancements, divvy up the work, and make the whole something we can be proud of.

† I can't believe I actually used that phrase.

--
Eric M.

Originally posted by @ericleasemorgan in #58 (comment)

molikd · 2020-05-27T18:06:44Z

This ticket is related to #55. Going to close #55 in favor of this more succinct ticket.

Here are tickets for the things you mentioned, where I could find them:

add long titles to list of carrels
- https://cord.distantreader.org/carrels should list titles of carrels not just filenames #30
allow people other than Team CORD to create study carrels
create a study carrel out of the whole of CORD, which requires scalability
create better stop word list
- Customize Stopwords #58
enable the whole "library" to be re-created
enhance author names with corresponding ORCIDs
enhance Web presence with additional logos and attributions:
- Update the “about” link at the bottom of carrels pages https://carrels.distantreader.org/ to point to the citeus page or the github readme #29
- Banner Image and Photo Credit on distantreader.org #25
- double check iconography at bottom of distantreader.org #7
- Need a Distant Reader logo. #40
extract additional grammars
figure out a way to dynamically create stop word list
- Customize Stopwords #58
generate additional measures of the documents
hyperlink bibliographic items to full text and other things
illustrate relationships using a network diagram
improve topic modeling
- modify the Distant Reader to use domain-specific machine learning models (specifically, scispaCy) #8
- Integrate https://spacy.io/universe/project/displacy-ent #27
index study carrels
make everything FAIR
plot results on a map
plot results on a time line
refine entity output

ericleasemorgan · 2020-06-25T19:34:05Z

To the best of my ability, things have been "ticket-ified".

nkmeyers assigned ericleasemorgan May 26, 2020

nkmeyers added this to To do in Project CORD Sprint #1 Due May 29th May 26, 2020

nkmeyers added the data (This issue is related to data), enhancement (New feature or request), and index (This issue is related to indexing) labels May 26, 2020

nkmeyers added this to the Get the whole thing running milestone May 26, 2020

molikd mentioned this issue May 27, 2020

Selectively disseminate tickets #55

Closed

molikd added this to Triage in The Reader Meets COVID-19 via automation May 27, 2020

molikd added the COVID-19 label (this issue is top priority because of COVID-19) May 27, 2020

molikd moved this from Triage to Tasks in The Reader Meets COVID-19 May 29, 2020

molikd moved this from Tasks to In Progress in The Reader Meets COVID-19 May 29, 2020

nkmeyers removed this from To do in Project CORD Sprint #1 Due May 29th May 30, 2020

ericleasemorgan moved this from In Progress to Done in The Reader Meets COVID-19 Jun 25, 2020

ericleasemorgan closed this as completed Jun 25, 2020
