Implement Backend Support for Make Data Count use and citation metrics #4821

Closed
adam3smith opened this issue Jul 10, 2018 · 51 comments

Comments

@adam3smith
commented Jul 10, 2018

Following up on the google group discussion here: https://groups.google.com/forum/#!topic/dataverse-community/rQWNllAyTu0

Dataverse should support and display Make Data Count (https://makedatacount.org/) - standardized usage metrics

Slides and QA from recent (July 2018) webinar here:
https://makedatacount.org/presentations/

Here are detailed guidelines for implementation:
https://github.com/CDLUC3/Make-Data-Count/blob/master/getting-started.md

Here are the implementation steps from an earlier presentation by the project:

@mfenner

commented Oct 18, 2018

Feel free to ask any questions DataCite staff can help with here.

@pdurbin

Member

commented Oct 30, 2018

@djbrooke

Contributor

commented Oct 31, 2018

We'll determine how these metrics appear on the page as #3404 moves through our design process, but there's an opportunity to get the backend pieces in place. Some proposed steps for discussion and estimation:

  • Sending the logs to DataCite (and any processing we need to do beforehand)
  • Receiving data back from DataCite
  • Storing the data that's sent back
  • Getting the information onto the page (but not displaying it)

This will position us well for implementation once we have the designs further along and validated.
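
To make the "storing the data that's sent back" step more concrete, here is a minimal sketch of one possible storage shape: one row per dataset, month, and metric type. Everything in it is an assumption for illustration only. The table name `dataset_metrics`, its columns, the helper functions, and the sample DOI are hypothetical, and SQLite merely stands in for the real database.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite database and create the (hypothetical) metrics table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dataset_metrics (
            dataset_pid  TEXT    NOT NULL,
            month        TEXT    NOT NULL,
            metric_type  TEXT    NOT NULL,
            metric_count INTEGER NOT NULL,
            PRIMARY KEY (dataset_pid, month, metric_type)
        )
        """
    )
    return conn


def save_metric(conn, dataset_pid, month, metric_type, metric_count):
    """Insert a monthly count, replacing any earlier value for the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO dataset_metrics VALUES (?, ?, ?, ?)",
        (dataset_pid, month, metric_type, metric_count),
    )
    conn.commit()


def total_for(conn, dataset_pid, metric_type):
    """Sum a metric across all stored months for one dataset."""
    row = conn.execute(
        "SELECT COALESCE(SUM(metric_count), 0) FROM dataset_metrics "
        "WHERE dataset_pid = ? AND metric_type = ?",
        (dataset_pid, metric_type),
    ).fetchone()
    return row[0]


# Hypothetical usage: the DOI and the counts are made up.
conn = open_store()
save_metric(conn, "doi:10.5072/FK2/EXAMPLE", "2018-11", "views", 3)
save_metric(conn, "doi:10.5072/FK2/EXAMPLE", "2018-12", "views", 2)
print(total_for(conn, "doi:10.5072/FK2/EXAMPLE", "views"))
```

Keying rows by month keeps the door open for a time series display later, while a single total is still one query away.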

@djbrooke djbrooke changed the title from "Implement Make Data Count use and citation metrics" to "Implement Backend Support for Make Data Count use and citation metrics" Oct 31, 2018

@djbrooke djbrooke assigned djbrooke and unassigned djbrooke Oct 31, 2018

@pdurbin

Member

commented Nov 13, 2018

I met @mbjones at the Whole Tale Workshop on Tools and Approaches for Publishing Reproducible Research, and he mentioned he'd be happy to field technical questions we have about DataONE's implementation of Make Data Count.

Meanwhile, DataONE put out a blog post at https://www.dataone.org/news/new-usage-metrics that has some nice screenshots of a dataset at https://search.dataone.org/view/doi:10.5063/F1Z899CZ which I'll put below:

[Screenshot: "DataONE implements new usage and citation metrics to make your data count", captured 2018-11-13]

@mbjones

commented Nov 15, 2018

Happy to help, @pdurbin. The time series graphs you cited were made much faster by caching results locally and then enabling group by at various levels of aggregation. The d3-charts we build and other visualizations are all part of our open source MetacatUI data portal frontend, so you might find some of that reusable.
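
As a rough illustration of the "cache locally, then group by at various levels of aggregation" idea, here is a minimal sketch. It is not DataONE's code. The `DAILY_VIEWS` data, the `aggregate` function, and the choice of `functools.lru_cache` as the cache are all assumptions made for this example.

```python
from collections import Counter
from datetime import date
from functools import lru_cache

# Hypothetical daily view counts, standing in for locally cached results.
DAILY_VIEWS = (
    (date(2018, 9, 30), 4),
    (date(2018, 10, 1), 2),
    (date(2018, 10, 15), 5),
    (date(2018, 11, 3), 1),
)

# Each aggregation level maps to the date format used as the grouping key.
LEVEL_FORMATS = {"day": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}


@lru_cache(maxsize=None)
def aggregate(level):
    """Group the cached daily counts by 'day', 'month', or 'year'.

    The result is memoized, so repeated requests for the same level
    (for example, redrawing a chart) do not recompute the totals.
    """
    fmt = LEVEL_FORMATS[level]
    totals = Counter()
    for day, views in DAILY_VIEWS:
        totals[day.strftime(fmt)] += views
    return dict(sorted(totals.items()))


print(aggregate("month"))
print(aggregate("year"))
```

A real cache would also need invalidation when new counts arrive. This sketch leaves that out.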

pdurbin added a commit that referenced this issue Nov 19, 2018

@pdurbin

Member

commented Nov 19, 2018

@mbjones thanks. Is there any reusable Java we might be interested in as well?

All, at standup today I said I was close to pushing some docs that capture my understanding of what we're trying to implement. These docs are in 4dd10bd but I'll add them as a screenshot below as well. I also stubbed out some API tests but nothing has been implemented yet. It's all just stubs. Feedback is welcome.

[Screenshot: Make Data Count docs page from the Dataverse guides, captured 2018-11-19]

@pdurbin

Member

commented Nov 20, 2018

Here's a to do list of tasks that are top of mind for me.

  • Read https://github.com/CDLUC3/Make-Data-Count/blob/master/getting-started.md
  • Read "COUNTER Code of Practice for Research Data": https://doi.org/10.7287/peerj.preprints.26505v1
  • Decide if we will be parsing logs that we generate or extending our guestbook feature to record views as well. Or other approaches. Discuss. Update: On 2018-11-27 we decided to parse logs rather than writing each view and download to the database, but we didn't consider multiple Glassfish servers and may need to think some more about this.
  • Decide how we will store the metrics in Dataverse. Which database tables? Should the JSON be cached? Update: on standup on 2018-12-03 I explained that the DataONE interface shows more than just a number for views, for example. It shows a time series chart of views per month. Is this what we want?
  • Ask @mbjones if there is any re-usable Java code from DataONE's Make Data Count implementation. Could also ask in https://dataoneorg.slack.com which I joined. Update: No Java code. See below.
  • Decide if we can make use of https://github.com/CDLUC3/counter-processor and follow up on decision at CDLUC3/Make-Data-Count#99 . Update: Apache logs cannot simply be parsed as-is as explained in CDLUC3/counter-processor#3 . Dataverse must emit logs in a particular format to make use of counter-processor. On 2018-11-27 we decided that we probably won't use counter-processor because it introduces a dependency and because it doesn't "just work" with our logs: CDLUC3/Make-Data-Count#99 (comment)
  • Is there value in me (and/or others on the Dataverse team) joining https://www.rd-alliance.org/groups/data-usage-metrics-wg ? Is there a mailing list with public archives? The most recent item under "Recent Activity" is a blog post from June. Update: A "pdurbin" account was applied for on 2018-12-03 and the response back was "Your request will be approved by RDA Secretariat and your account activated within 1 business day." Update 2018-12-07: still no word on the "pdurbin" account, and we were told in the meeting by Martin that there are no implementation details in there.
  • Question: Can Dataverse express data citations? Can "Related Publications" be used? Update: yes, Dataverse can express citations but "Related Dataset" should be used. See 17cbf37
  • If Dataverse can express data citations, can the DataCite hub receive them? In 4dd10bd I only talk about sending views/investigations and downloads/requests. Update: Yes, DataCite can receive data citations (makes sense, I guess 😄 ). See 17cbf37 and discussion below.
  • Is "DataCite hub" the right name for the service that Dataverse installations will be sending data to? Update: "DataCite hub" is what's shown at https://makedatacount.org/roadmap/ so we'll go with that.
  • Make sure @mheppler @TaniaSchlatter @dlmurphy and @jggautier know that there is some potentially reusable front end code for #5253 from DataONE as @mbjones indicated above. A good starting point may be NCEAS/metacatui#594 . Update: discussed at a standup before Thanksgiving.
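
To illustrate the "parse logs" option and the open question about multiple Glassfish servers, here is a minimal sketch that counts events per dataset and month, then merges counts from two servers. The tab-separated log format, the `count_events` helper, and the sample DOI are hypothetical; Dataverse's real log format is not shown in this thread. The sketch also deliberately ignores the processing rules in the COUNTER Code of Practice, so it is a counting illustration rather than a compliant implementation.

```python
from collections import Counter

# Pairing used in this thread: views/investigations and downloads/requests.
EVENT_TO_METRIC = {"view": "investigations", "download": "requests"}


def count_events(lines):
    """Count events per (dataset PID, month, metric) from log lines.

    Assumed (hypothetical) line format, tab-separated:
        ISO timestamp <TAB> dataset PID <TAB> event type
    """
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip malformed lines
        timestamp, pid, event = parts
        metric = EVENT_TO_METRIC.get(event)
        if metric is None:
            continue  # skip event types we do not report
        month = timestamp[:7]  # 'YYYY-MM' prefix of the ISO timestamp
        counts[(pid, month, metric)] += 1
    return counts


# Made-up log excerpts from two application servers.
SERVER_A_LOG = [
    "2018-11-27T10:00:00\tdoi:10.5072/FK2/EXAMPLE\tview",
    "2018-11-27T10:05:00\tdoi:10.5072/FK2/EXAMPLE\tdownload",
    "not a valid line",
]
SERVER_B_LOG = [
    "2018-11-28T09:00:00\tdoi:10.5072/FK2/EXAMPLE\tview",
    "2018-12-01T09:00:00\tdoi:10.5072/FK2/EXAMPLE\tview",
]

# Counter objects add element-wise, so per-server counts merge directly.
combined = count_events(SERVER_A_LOG) + count_events(SERVER_B_LOG)
for key, value in sorted(combined.items()):
    print(key, value)
```

Because the per-server counts are plain `Counter` objects, each server could be processed independently and the results summed afterwards, which is one possible answer to the multiple-server concern.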

I also wanted to note that I set up a Jenkins job to build the guides from the branch I'm using to http://guides.dataverse.org/en/4821-make-data-count/admin/make-data-count.html

I asked the Dataverse community for feedback at https://groups.google.com/d/msg/dataverse-community/rQWNllAyTu0/RMD0GEFzAgAJ

@djbrooke djbrooke self-assigned this Nov 20, 2018

matthew-a-dunlap added a commit that referenced this issue Feb 28, 2019

@pdurbin pdurbin removed their assignment Feb 28, 2019

pdurbin added a commit that referenced this issue Feb 28, 2019

matthew-a-dunlap added a commit that referenced this issue Mar 1, 2019

@kcondon kcondon assigned kcondon and unassigned matthew-a-dunlap Mar 1, 2019

matthew-a-dunlap added a commit that referenced this issue Mar 1, 2019

matthew-a-dunlap added a commit that referenced this issue Mar 6, 2019

MDC Missed basic dataset json api #4821
The fun never ends!

@kcondon kcondon removed their assignment Mar 8, 2019

@sekmiller sekmiller assigned sekmiller and unassigned sekmiller Mar 8, 2019

@matthew-a-dunlap

Contributor

commented Mar 11, 2019

We are waiting on a new api token to complete testing of this story.

@matthew-a-dunlap

Contributor

commented Mar 15, 2019

Turns out it was a dev box issue, not an API token issue (datacite/sashimi#56). I was able to submit to the test box with our current API token, so this is unblocked.

@matthew-a-dunlap matthew-a-dunlap removed their assignment Mar 15, 2019

@kcondon kcondon self-assigned this Mar 15, 2019

@kcondon kcondon closed this in 78cb180 Mar 15, 2019

kcondon added a commit that referenced this issue Mar 15, 2019

Merge pull request #5329 from IQSS/4821-make-data-count
Make Data Count support: backend #4821

@kcondon kcondon removed the Status: QA label Mar 15, 2019

@djbrooke djbrooke added this to the 4.12 milestone Mar 18, 2019
