Set up stats and quality plots in the README #291
Comments
@BobBorges I have now updated the issue to make it clearer what I think we should include in the README.
Excellent! Thanks for the direction.
It seems that GitHub Markdown doesn't support any kind of variable substitution or file transclusion, so neither of the following strategies would be possible in the README: variable substitution (as in Django templates) or file transclusion (as in LaTeX).
Do any of you have suggestions for automating dynamic updates to the README? @MansMeg @ninpnin Plots are not an issue: these can be added and updated the same way as the speaker mapping accuracy plot, as part of each release cycle. Text and tables are more troublesome. Two options come to mind:
I'm not aware of a Markdown parser that can read an .md file and update targeted fragments (like etree for XML or orgparse for .org files), but maybe something like this exists. I started working with strategy (2), but wanted to put the issue up for discussion before getting too deep into it.
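For discussion, here is a minimal sketch of one way to update a targeted fragment without a full Markdown parser. It is not either of the strategies above, just an illustration. It assumes the README wraps the generated region in a pair of HTML comment markers, which GitHub does not render. The marker format (`<!-- stats:start -->` / `<!-- stats:end -->`) and the function name `replace_between_markers` are made up for this example.

```python
import re


def replace_between_markers(text: str, name: str, new_body: str) -> str:
    """Replace whatever sits between <!-- name:start --> and <!-- name:end -->.

    The marker convention is an assumption for this sketch, not an existing
    standard or something the repository already uses.
    """
    start = f"<!-- {name}:start -->"
    end = f"<!-- {name}:end -->"
    pattern = re.compile(
        re.escape(start) + r".*?" + re.escape(end),
        flags=re.DOTALL,  # let "." span the line breaks inside the fragment
    )
    replacement = f"{start}\n{new_body}\n{end}"
    # Passing a function avoids backslash/group interpretation in new_body.
    new_text, count = pattern.subn(lambda _match: replacement, text, count=1)
    if count == 0:
        raise ValueError(f"marker pair {name!r} not found")
    return new_text


# Hypothetical README content, for illustration only.
readme = "# Title\n\n<!-- stats:start -->\nold table\n<!-- stats:end -->\n\nMore text.\n"
updated = replace_between_markers(readme, "stats", "| a | b |")
```

Because the markers are kept in place, the same call can be rerun on every release; everything outside the marker pair is left untouched.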
In R, I can do this with knitr and the kable() function, which then computes everything automatically; see here: But I'm not sure how best to do this in Python. @ninpnin probably knows this better than me. Maybe the Markdown library? It looks mature to me, but I'm unsure whether it solves the problem.
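As a rough Python counterpart to what kable() does for tables, a pipe table can be rendered with a few lines of standard-library code. This is only a sketch: the function name `to_markdown_table` and the metric names are placeholders, and the values are not real corpus numbers.

```python
def to_markdown_table(headers, rows):
    """Render headers and rows as a GitHub-flavoured Markdown pipe table."""

    def fmt(cells):
        # One table line: cells separated by pipes, with outer pipes.
        return "| " + " | ".join(str(cell) for cell in cells) + " |"

    lines = [fmt(headers), fmt(["---"] * len(headers))]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


# Placeholder metrics and values, for illustration only.
table = to_markdown_table(
    ["metric", "value"],
    [["example_count", 3], ["example_ratio", 0.5]],
)
print(table)
```

If the stats already live in a pandas DataFrame, `DataFrame.to_markdown()` produces similar output, but it needs the optional `tabulate` package installed.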
We want to be able to follow the quality of the corpus and a reasonable set of stats for each new release of the corpus. These are commonly used by people doing research and should be easy to update. Also, old numbers for previous releases should be stored, as is done now with the quality plot. We should probably store and plot the stats both in total and by year, since many researchers will cut out the years that are of relevance to them.
Hence, ideally, we would have one stats dashboard and one quality dashboard. Then, we could link to these figures from the project's homepage.
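One possible way to keep old numbers around while adding new ones is a single long-format CSV with one row per release, year and metric. The sketch below assumes that layout. The file name, column names, release labels and metric name are all hypothetical, and the numbers are dummy values rather than corpus statistics.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["release", "year", "metric", "value"]  # assumed column layout


def record_release_stats(path, release, stats_by_year):
    """Store one release's per-year stats, keeping rows from earlier releases."""
    path = Path(path)
    rows = []
    if path.exists():
        with path.open(newline="") as f:
            # Drop earlier rows for this release so reruns do not duplicate.
            rows = [r for r in csv.DictReader(f) if r["release"] != release]
    for year, metrics in sorted(stats_by_year.items()):
        for metric, value in sorted(metrics.items()):
            rows.append(
                {"release": release, "year": year, "metric": metric, "value": value}
            )
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def totals(path, release):
    """Sum each metric over all years for one release."""
    out = {}
    with Path(path).open(newline="") as f:
        for r in csv.DictReader(f):
            if r["release"] == release:
                out[r["metric"]] = out.get(r["metric"], 0) + float(r["value"])
    return out


# Dummy releases and values, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    stats_file = Path(tmp) / "stats.csv"
    record_release_stats(stats_file, "vX", {1900: {"n_words": 10}, 1901: {"n_words": 5}})
    record_release_stats(stats_file, "vY", {1900: {"n_words": 12}})
    example_totals = totals(stats_file, "vX")
    stored_rows = list(csv.DictReader(stats_file.open(newline="")))
```

The by-year rows can feed the plots directly, and the totals are derived from them, so only one file has to be updated per release.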
We should add the following plots to the README:
Corpus information
(This should probably just be a table).
Corpus Statistics - Figures
Corpus Quality