Download Wikidata stats for a campaign #4803

cyrillefr
Contributor

Addresses issue #4345: Wikidata stats should be a download option for an entire campaign

When wiki_education is set to true in application.yml, a new link is shown for downloading Wikidata statistics for a campaign.
A similar link already exists, but only for individual courses.
The CSV data is an aggregate of the per-course data.

Commit message:

  • Add link to wikidata stats download to UI
  • Back end code (controller + lib)
  • Refactoring CourseWikidataCsvBuilder to use with multiple courses
    in order to also use it with a Campaign.
  • Small refactoring in worker(rubocop)
  • Add locales in en and fr
  • Add route
  • Add tests: units, controllers
  • Add factory for CourseStat

const wikidataLink = `/campaigns/${campaignSlug}/wikidata.csv`;

let wikidataCsvLink;
if (Features.wikiEd) {
Member

There's no need to limit this based on Features.wikiEd.

Contributor Author

I did that because the course version has the same limit (app/assets/javascripts/components/overview/course_stats_download_modal.jsx).
I assumed it should be the same for a campaign. I will remove it.

Member

Ah! We should change that limitation on CourseStatsDownloadModal... it should now check whether the course object has a course_stats property. @1v4n4's project made the wikidata stats data available without the Wiki Ed restriction.

@@ -44,6 +45,11 @@ def revisions_to_csv
CSV.generate { |csv| csv_data.each { |line| csv << line } }
end

def wikidata_to_csv
courses = @campaign.courses.joins(:course_stat)
CourseWikidataCsvBuilder.new(courses).generate_csv
Member

I'd prefer this to follow the same pattern as the other methods like articles_to_csv, building an aggregate CSV by looping through the courses in the campaign, and adding a row of wikidata stats for each course.

Making CourseWikidataCsvBuilder handle the cases of one as well as multiple courses will make this harder to reason about and change, I think.

The problem with that approach is that, unlike the other CSV builders, the output for the Wikidata one isn't a format that is suitable to that aggregation strategy. I think the best way around that will be to change the output format of the Wikidata CSV builder. Currently, it uses headers of revision_type, count with one row per revision type. The alternative would be to have a column for each revision type, so that the campaign version could have one row per course.
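
As a rough illustration of the format change described above, here is a minimal sketch that pivots one course's rows from the current revision_type, count layout into a single wide row. The sample stat names, the `pivot_to_wide` helper, and the placement of the course title in the first column are assumptions made for illustration; they are not taken from CourseWikidataCsvBuilder.

```ruby
# Hypothetical sample of the current output: one row per revision type.
LONG_FORMAT = [
  ['revision_type', 'count'],
  ['claims created', 4],
  ['claims changed', 2],
  ['claims removed', 1]
].freeze

# Pivot the long-format rows into a [header, row] pair with one column
# per revision type. pivot_to_wide is an illustrative helper, not part
# of the dashboard code base.
def pivot_to_wide(long_rows, course_title)
  stats = long_rows.drop(1) # skip the revision_type/count header row
  header = ['course'] + stats.map(&:first)
  row = [course_title] + stats.map(&:last)
  [header, row]
end

header, row = pivot_to_wide(LONG_FORMAT, 'This course name')
```

With this layout every course contributes exactly one row, so a campaign CSV can share a single header and append one row per course.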

Contributor Author

I went with the join so that I would not have to loop, but I can switch to the articles_to_csv pattern.
I will also write two builders: CourseWikidataCsvBuilder for a single course and a separate one for a campaign.

So, if I understand correctly, your proposed format for a campaign is:
claims created, claims changed, claims removed, etc. course // header
4,2,1, xx,x,"This course name" // lines of data for one course of a campaign
2,4,5, xx, xxx, "This other course name" // same

Also, do you want me to change CourseWikidataCsvBuilder so that it uses the same format, but with just one line?

Member

Yes, that's correct, except that course name should be the first column.
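
A minimal sketch of the agreed layout, assuming the campaign CSV is built by looping over courses in the same spirit as articles_to_csv. `CourseRow`, `STAT_COLUMNS`, and `campaign_wikidata_csv` are hypothetical stand-ins for illustration only; the real code would read the stats from each course's stored data.

```ruby
require 'csv'

# Hypothetical stand-in for a course and its Wikidata stats.
CourseRow = Struct.new(:title, :stats)

# Assumed stat columns; the real list would come from the stats themselves.
STAT_COLUMNS = ['claims created', 'claims changed', 'claims removed'].freeze

# Build one CSV with the course name as the first column
# and one row per course.
def campaign_wikidata_csv(courses)
  CSV.generate do |csv|
    csv << ['course', *STAT_COLUMNS]
    courses.each do |course|
      counts = STAT_COLUMNS.map { |column| course.stats.fetch(column, 0) }
      csv << [course.title, *counts]
    end
  end
end

courses = [
  CourseRow.new('This course name',
                { 'claims created' => 4, 'claims changed' => 2, 'claims removed' => 1 }),
  CourseRow.new('This other course name',
                { 'claims created' => 2, 'claims changed' => 4, 'claims removed' => 5 })
]
csv_text = campaign_wikidata_csv(courses)
```

Defaulting a missing stat to 0 is a choice made for this sketch; it keeps every row the same width even if a course lacks one of the counts.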


def sum_wiki_columns(csv_data)
# Skip 1st header row + 1st column course name
csv_data[1..].transpose[1..].map(&:sum).unshift('Total')
Member

This is clever. We don't include a total line in any of the other CSVs, but I'm okay with it.
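
For readers unfamiliar with the transpose idiom, here is a small worked example of the method from the diff applied to made-up data. The sample rows and column names are assumptions for illustration; only the body of sum_wiki_columns comes from the pull request.

```ruby
# Hypothetical campaign data: a header row, then one row per course.
csv_data = [
  ['course', 'claims created', 'claims changed', 'claims removed'],
  ['This course name', 4, 2, 1],
  ['This other course name', 2, 4, 5]
]

def sum_wiki_columns(csv_data)
  # csv_data[1..]  drops the header row, leaving only course rows
  # .transpose     turns rows into columns
  # [1..]          drops the first column (the course names)
  # .map(&:sum)    adds up each remaining numeric column
  csv_data[1..].transpose[1..].map(&:sum).unshift('Total')
end

total = sum_wiki_columns(csv_data)
```

Here `total` is `['Total', 6, 6, 6]`. Note that `Array#transpose` requires all rows to have the same length, so every course row needs the same set of columns for this to work.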

@ragesoss ragesoss merged commit b20a93a into WikiEducationFoundation:master Mar 7, 2022
@cyrillefr cyrillefr deleted the add_download_option_for_wikidata_stats_to_entire_campagin_4345 branch April 10, 2024 09:18