1.5 day workshop in Lucca #27

Open
13 tasks
adswa opened this issue Feb 11, 2020 · 2 comments


adswa commented Feb 11, 2020

@mih and I will be giving a workshop on DataLad in Lucca on March 23rd-24th. This issue lists the TODOs and acts as a progress tracker.
Please extend and edit as necessary. :)

Logistics

  • await feedback from Lucca on dates
  • await feedback from Lucca on GDrive account
  • figure out travel
    • @adswa (I will likely take a train. Depending on when we plan to arrive, there is a nice one overnight, arriving at 7 something in the morning) EDIT: both of us will go to Pisa from Montreal
    • @mih

Software

  • write a custom wrapper around a special remote for gdrive.

Teaching

A basic layout has been proposed by @mih and awaits feedback from Lucca:

  • DataLad concepts and principles

  • Basics of local data/code version control

    • Hands on: tasks to exercise basic building blocks
  • Modular data management for reproducible science

    • Hands on: implement sketch of a reproducible paper
  • Data management for collaborative science

    • Hands on: Using your infrastructure (GDrive) to collaborate on a demo project
  • Data publication

    • Hands on: Publish data on "GitHub"
  • Outlook (what else is possible, resources, use cases)

  • Potential group work: Small sets of people are given problems to solve with DataLad and present

This is currently structured like this:
Monday 23 Morning session
1 DataLad concepts and principles
2 Basics of local data/code version control + Hands on: tasks to exercise basic building blocks

Monday 23 Afternoon session
1 Modular data management for reproducible science + Hands on: implement sketch of a reproducible paper
2 Data management for collaborative science + Hands on: Using your infrastructure (Gdrive) to collaborate on a demo project

Tuesday 24 Morning session
1 Data publication + Hands on: Publish data on "GitHub"
2 Outlook (what else is possible, resources, use cases)

Resources to create

  • rclone GDrive wrapper (started in datalad/datalad#4162, "TMP/NF: Draft a wrapper around rclone")
  • slides
  • code lists
  • sketches of a LaTeX (?) skeleton for a reproducible paper. @adswa could potentially use resources she will help to improve at the Turing Way book dash.
  • Data to use for examples and to publish to Gdrive
  • Optional/Wishlist: Some sort of audience response system, e.g., EduVote (browser-based), Google Forms, ...? For instance, in the form of "How confident are you using ..." --> rating scale
  • Workshop feedback (potentially pre-post, to learn about attendees' expectations before and after the course and about knowledge gain). Also remember to collect feedback on DataLad.

adswa commented Feb 14, 2020

I have created a free GDrive account for testing (dataladtester@gmail.com). This gives us 15GB to play with.


adswa commented Feb 14, 2020

rclone for GDrive notes (quoted from the rclone documentation, so "I" below refers to the rclone author):

When you use rclone with Google drive in its default configuration you are using rclone’s client_id. This is shared between all the rclone users. There is a global rate limit on the number of queries per second that each client_id can do set by Google. rclone already has a high quota and I will continue to make sure it is high enough by contacting Google.

It is strongly recommended to use your own client ID as the default rclone ID is heavily used. If you have multiple services running, it is recommended to use an API key for each service. The default Google quota is 10 transactions per second so it is recommended to stay under that number as if you use more than that, it will cause rclone to rate limit and make things slower.
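If our own wrapper ends up issuing Drive requests in a loop, one way to stay under that quota is to space the calls out on our side. This is only a rough sketch: the `Throttle` class and the budget of 8 calls per second are made up for illustration and are not part of rclone or DataLad.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_second` start per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


# Hypothetical budget: stay below the default quota of 10 transactions/second.
drive_throttle = Throttle(max_per_second=8)
```

Calling `drive_throttle.wait()` before each request keeps consecutive requests at least 1/8 s apart. rclone also has a `--tpslimit` option for limiting transactions per second, which is probably the simpler route if all traffic goes through rclone anyway.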

Here is how to create your own Google Drive client ID for rclone:

Log into the Google API Console with your Google account. It doesn’t matter what Google account you use. (It need not be the same account as the Google Drive you want to access)

Select a project or create a new project.

Under “ENABLE APIS AND SERVICES” search for “Drive”, and enable the “Google Drive API”.

Click “Credentials” in the left-side panel (not “Create credentials”, which opens the wizard), then “Create credentials”, then “OAuth client ID”. It will prompt you to set the OAuth consent screen product name, if you haven’t set one already.

Choose an application type of “other”, and click “Create”. (the default name is fine)

It will show you a client ID and client secret. Use these values in rclone config to add a new remote or edit an existing remote.
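For the workshop it might be handy to generate the remote section instead of clicking through `rclone config` on every machine. A minimal sketch, assuming the usual INI layout of the rclone config file and the `drive` backend options `client_id`, `client_secret` and `scope`; the remote name `workshop-gdrive` and the placeholder credentials are made up.

```python
import configparser
import io


def render_drive_remote(name, client_id, client_secret, scope="drive"):
    """Return rclone config text for one Google Drive remote."""
    # interpolation=None so that a '%' in a credential is stored verbatim
    config = configparser.ConfigParser(interpolation=None)
    config[name] = {
        "type": "drive",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder values; paste the real ones from the Google API Console.
print(render_drive_remote("workshop-gdrive", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET"))
```

The printed section would go into the file that `rclone config file` reports. This only covers the client ID part: rclone still has to obtain an OAuth token for the account, e.g. via the interactive `rclone config`.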
