Download Wikidata stats for a campaign #4803
- Add link to wikidata stats download to UI
- Back end code (controller + lib)
- Refactoring CourseWikidataCsvBuilder to use with multiple courses in order to also use it with a Campaign.
- Small refactoring in worker (rubocop)
- Add locales in en and fr
- Add route
- Add tests: units, controllers
- Add factory for CourseStat
const wikidataLink = `/campaigns/${campaignSlug}/wikidata.csv`;

let wikidataCsvLink;
if (Features.wikiEd) {
There's no need to limit this based on Features.wikiEd.
I did that because the course version has this limit (app/assets/javascripts/components/overview/course_stats_download_modal.jsx), so I assumed it would be the same for a campaign. I will remove it.
Ah! We should change that limitation on CourseStatsDownloadModal... now it should check for whether the course object has a course_stats property. @1v4n4's project made the wikidata stats data available without the Wiki Ed restriction.
@@ -44,6 +45,11 @@ def revisions_to_csv
    CSV.generate { |csv| csv_data.each { |line| csv << line } }
  end

  def wikidata_to_csv
    courses = @campaign.courses.joins(:course_stat)
    CourseWikidataCsvBuilder.new(courses).generate_csv
I'd prefer this to follow the same pattern as the other methods like articles_to_csv, building an aggregate CSV by looping through the courses in the campaign, and adding a row of wikidata stats for each course.
Making CourseWikidataCsvBuilder handle the cases of one as well as multiple courses will make this harder to reason about and change, I think.
The problem with that approach is that, unlike the other CSV builders, the output for the Wikidata one isn't a format that is suitable to that aggregation strategy. I think the best way around that will be to change the output format of the Wikidata CSV builder. Currently, it uses headers of revision_type, count with one row per revision type. The alternative would be to have a column for each revision type, so that the campaign version could have one row per course.
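To make the two shapes concrete, here is a minimal sketch of the current layout (one row per revision type) next to the proposed one (one column per revision type). The revision type names and the `long_format` / `wide_row` helpers are assumptions for illustration only; they are not the actual CourseWikidataCsvBuilder code.

```ruby
# Hypothetical revision types, for illustration only.
REVISION_TYPES = ['claims created', 'claims changed', 'claims removed'].freeze

# Current shape: headers of revision_type, count, with one row per revision type.
def long_format(stats)
  [%w[revision_type count]] + REVISION_TYPES.map { |type| [type, stats.fetch(type, 0)] }
end

# Proposed shape: one column per revision type, so a course fits in a single row.
def wide_row(stats)
  REVISION_TYPES.map { |type| stats.fetch(type, 0) }
end

stats = { 'claims created' => 4, 'claims changed' => 2, 'claims removed' => 1 }

long = long_format(stats)
wide = [REVISION_TYPES, wide_row(stats)]

long.each { |row| puts row.join(',') }
wide.each { |row| puts row.join(',') }
```

Only the second shape can be stacked: each additional course adds one more row under the same header.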
I went with the join so that I would not have to loop, but I can change to the articles_to_csv pattern.
I will also split this in two: one CourseWikidataCsvBuilder for courses and another one for campaigns.
So, if I understand correctly, your proposed format for a campaign is:
claims created, claims changed, claims removed, etc. course // header
4,2,1, xx,x,"This course name" // lines of data for one course of a campaign
2,4,5, xx, xxx "This other course name" // same
Also, do you want me to change CourseWikidataCsvBuilder so that it uses the same format, but with just one line?
Yes, that's correct, except that course name should be the first column.
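The agreed layout (course name first, then one column per revision type, one row per course) could be built by looping over the campaign's courses, roughly as below. This is only a sketch: `CourseRow`, `WIKIDATA_COLUMNS` and `campaign_wikidata_rows` are hypothetical stand-ins, not the real models or the final implementation.

```ruby
require 'csv'

# Hypothetical stand-in for a course and its Wikidata stats.
CourseRow = Struct.new(:title, :stats)

# Hypothetical column names, for illustration only.
WIKIDATA_COLUMNS = ['claims created', 'claims changed', 'claims removed'].freeze

# Build a header row, then add one row of stats per course.
def campaign_wikidata_rows(courses)
  csv_data = [['course name'] + WIKIDATA_COLUMNS]
  courses.each do |course|
    csv_data << [course.title] + WIKIDATA_COLUMNS.map { |type| course.stats.fetch(type, 0) }
  end
  csv_data
end

courses = [
  CourseRow.new('This course name',
                { 'claims created' => 4, 'claims changed' => 2, 'claims removed' => 1 }),
  CourseRow.new('This other course name',
                { 'claims created' => 2, 'claims changed' => 4, 'claims removed' => 5 })
]

csv_data = campaign_wikidata_rows(courses)
csv_text = CSV.generate { |csv| csv_data.each { |line| csv << line } }
puts csv_text
```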
def sum_wiki_columns(csv_data)
  # Skip 1st header row + 1st column course name
  csv_data[1..].transpose[1..].map(&:sum).unshift('Total')
This is clever. We don't include a total line in any of the other CSVs, but I'm okay with it.
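As a worked example of the total line, the snippet below runs the same one-liner on made-up data. The sample rows and column names are assumptions; only the body of `sum_wiki_columns` comes from the diff.

```ruby
def sum_wiki_columns(csv_data)
  # Skip 1st header row + 1st column course name
  csv_data[1..].transpose[1..].map(&:sum).unshift('Total')
end

# Made-up sample data: a header row, then one row per course.
csv_data = [
  ['course name', 'claims created', 'claims changed', 'claims removed'],
  ['This course name', 4, 2, 1],
  ['This other course name', 2, 4, 5]
]

# Dropping the header leaves the course rows; transposing turns rows into
# columns; dropping the first column removes the course names; each remaining
# column is then summed.
total = sum_wiki_columns(csv_data)
p total # => ["Total", 6, 6, 6]
```

One thing to watch: when csv_data holds only the header row, `transpose[1..]` evaluates to nil, so calling `map` on it raises NoMethodError. A campaign with no matching courses may therefore need a guard before this method is called.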
Addresses issue #4345: Wikidata stats should be download option for an entire campaign
When wiki_education is set to true in application.yml, there is a new link to download wikidata statistics for a campaign.
There is currently a similar link, but only for courses.
The CSV data is an aggregate of the per-course data.
Commit message:
in order to also use it with a Campaign.