# Download and Upload with OSF

This tutorial shows how to download and upload cryo-EM datasets using the `datasets` module from `ioSPI`, which interacts with the [Open Science Framework (OSF)](https://osf.io/).

OSF is an initiative that aims to increase the openness, reproducibility and integrity of scientific research. Among other features, it lets you upload scientific data, which can then be accessed through an Application Programming Interface (API).

`ioSPI` offers two ways to upload and access cryo-EM data:
- to get started: the class `Project`, which leverages the package `osfclient`;
- for finer control: the class `OSFUpload`, which follows `OSF APIv2`.

This tutorial introduces both options.
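To give a feel for what working against the OSF API v2 directly involves, here is a minimal sketch that only builds (and does not send) an authenticated request for the file listing of a project's `osfstorage`. It is not how `OSFUpload` is implemented; the function name `build_file_listing_request` and the placeholder token are assumptions made for illustration, and `7g42j` is the project ID used later in this tutorial.

```python
from urllib.request import Request

# Base URL of the public OSF API v2.
OSF_API_BASE = "https://api.osf.io/v2"


def build_file_listing_request(project_id, token):
    """Build (without sending) a request listing a project's osfstorage files.

    Hypothetical helper for illustration; the personal token is sent as a
    Bearer token in the Authorization header.
    """
    url = f"{OSF_API_BASE}/nodes/{project_id}/files/osfstorage/"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# "YOUR_TOKEN" is a placeholder, not a real token.
request = build_file_listing_request("7g42j", "YOUR_TOKEN")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual call; the `Project` class shown below hides these details.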

# Set-up

First, you will need to get set up with OSF.

- Create an account on https://osf.io/ and save the email address you use.
- On this account, create a personal token in [Settings](https://osf.io/settings/tokens) and save it.

The email address and the token will be needed to connect to different OSF projects.
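Since the token is a secret, you may prefer not to type it directly into a notebook. The following is a minimal sketch, assuming you have stored your credentials in two environment variables; the names `OSF_USERNAME` and `OSF_TOKEN` and the helper `load_osf_credentials` are choices made for this example, not something `ioSPI` requires.

```python
import os


def load_osf_credentials(environ=os.environ):
    """Return the OSF email address and token found in environment variables.

    OSF_USERNAME and OSF_TOKEN are variable names chosen for this sketch.
    """
    missing = [name for name in ("OSF_USERNAME", "OSF_TOKEN") if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return {"username": environ["OSF_USERNAME"], "token": environ["OSF_TOKEN"]}
```

The returned values can then be passed as the `username` and `token` arguments when creating a `Project` object.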

We import the `datasets` module from `ioSPI`:

In [2]:
from ioSPI.ioSPI import datasets

# Getting Started

## Configure your credentials to access the OSF Project

Find the OSF project from which you wish to download your data. 

In this tutorial, we use a project called "cryoEM simulated", which contains simulated images of the 80S human ribosome. This project is on OSF at the URL "https://osf.io/7g42j/".

- Save the ID of the project of interest, which appears in the project's URL.

In our case, the project ID is `7g42j`.
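If you handle several projects, the ID can also be read from the URL programmatically. This is a small sketch, assuming the ID is the first path segment of the project URL, as it is here; `project_id_from_url` is a hypothetical helper, not part of `ioSPI`.

```python
from urllib.parse import urlparse


def project_id_from_url(url):
    """Return the first path segment of an OSF project URL."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if not segments:
        raise ValueError(f"No project ID found in {url!r}")
    return segments[0]


print(project_id_from_url("https://osf.io/7g42j/"))  # prints 7g42j
```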

- Create an object from the class `Project` using:
  - your credentials from the set-up: email address and token,
  - the project ID that you just saved.

In [3]:
cryoem_simulated_project = datasets.Project(
    username="ninamio78@gmail.com", 
    token="HBGGBOJcLYQfadEKIOyXJiLTum3ydXK4nGP3KmbkYUeBuYkZma9LPBSYennQn92gjP2NHn",
    project_id="7g42j")

OSF config written to .osfcli.config!


You have successfully set up the configuration of the OSF project!

## List Files on the OSF Project

Now you can list the files available on this OSF project. Note that this code can take a few minutes to run.

In [4]:
cryoem_simulated_project.ls()

Listing files from OSF project: 7g42j...
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy7_defocus1.0_no_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy1_defocus1.5_yes_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy4_defocus2.0_no_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy8_defocus0.5_no_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy0_defocus0.5_no_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy0_defocus3.0_no_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy1_defocus2.5_no_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy7_defocus3.0_no_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy0_defocus2.0_yes_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy4_defocus2.0_yes_noise.h5
osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy4_defocus2.5_no_noise.h5
osfstorage/randomrot1D_nodisorde

osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy6_defocus3.0_no_noise.log
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy2_defocus3.0_yes_noise.inp
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy9_defocus1.5_yes_noise.txt
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy1_defocus1.0_no_noise.h5
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy6_defocus0.5_no_noise.h5
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy1_defocus3.0_no_noise.log
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy1_defocus1.5_no_noise.inp
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy1_defocus1.5_no_noise.mrc
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy1_defocus0.5_no_noise.log
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy6_defocus1.0_yes_noise.log
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy4_defocus1.0_no_noise.inp
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy8_defocus2.5_no_noise.log
osfstorage/randomrot1D_nodisorder/4v6x_randomrot_co

osfstorage/randomrot_nodisorder/4v6x_randomrot_copy14_defocus1.0_yes_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy14_defocus0.5_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy13_defocus3.0_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy13_defocus2.5_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy13_defocus2.0_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy13_defocus1.5_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy13_defocus1.0_yes_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy13_defocus1.0_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy13_defocus0.5_yes_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy12_defocus3.0_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy12_defocus2.5_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy12_defocus2.0_no_noise.h5
osfstorage/randomrot_nodisorder/4v6x_randomrot_copy12_defocus1.0_yes_nois

We observe that this project contains many files, organized in different folders.
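To get an overview of such a listing, it can help to count files per folder and per extension. The sketch below assumes you have the remote paths available as Python strings; here a few paths are copied by hand from the output above, since this tutorial does not show whether `ls()` returns them.

```python
from collections import Counter
from pathlib import PurePosixPath

# A handful of remote paths copied from the listing above.
paths = [
    "osfstorage/randomrot1D_nodisorder/final/4v6x_randomrot_copy7_defocus1.0_no_noise.h5",
    "osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy6_defocus3.0_no_noise.log",
    "osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy2_defocus3.0_yes_noise.inp",
    "osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy1_defocus1.5_no_noise.mrc",
    "osfstorage/randomrot_nodisorder/4v6x_randomrot_copy14_defocus1.0_yes_noise.h5",
]

# OSF paths use forward slashes, so PurePosixPath works on any operating system.
files_per_folder = Counter(str(PurePosixPath(p).parent) for p in paths)
files_per_extension = Counter(PurePosixPath(p).suffix for p in paths)

for folder, count in sorted(files_per_folder.items()):
    print(f"{folder}: {count} file(s)")
for extension, count in sorted(files_per_extension.items()):
    print(f"{extension}: {count} file(s)")
```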

## Download Files from the OSF Project

We can download any of these files. As an example, we choose the following `.txt` file from the project:

- `osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy0_defocus3.0_yes_noise.txt`.


In [5]:
cryoem_simulated_project.download(
    remote_path="osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy0_defocus3.0_yes_noise.txt", 
    local_path="4v6x_randomrot_copy0_defocus3.0_yes_noise.txt")

Downloading osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy0_defocus3.0_yes_noise.txt to 4v6x_randomrot_copy0_defocus3.0_yes_noise.txt...
Done!


  0%|          | 0.00/4.22k [00:00<?, ?bytes/s]100%|██████████| 4.22k/4.22k [00:00<00:00, 9.90Mbytes/s]
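In the call above, `local_path` is simply the file name of `remote_path`. The following sketch derives that name automatically and checks that a file is present and non-empty after a download; both helpers are hypothetical conveniences written for this example, not functions of `ioSPI`.

```python
import os
from pathlib import PurePosixPath


def default_local_path(remote_path):
    """Return the file name (last component) of a remote OSF path."""
    return PurePosixPath(remote_path).name


def is_nonempty_file(local_path):
    """Return True if local_path is an existing file with at least one byte."""
    return os.path.isfile(local_path) and os.path.getsize(local_path) > 0


remote = "osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy0_defocus3.0_yes_noise.txt"
local = default_local_path(remote)
print(local)
print(is_nonempty_file(local))  # True only if the file was downloaded to the working directory
```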


## Upload Files to an OSF Project

Importantly, OSF will not let you upload data to just any folder: authorization is required.

To test this functionality, you can create a new project on osf.io (https://osf.io/myprojects/) by clicking `Create project`.

This will create a new project page, like the one we are using here.
- Save the project ID of the project you just created!

You should then create a new `my_project` object of the class `datasets.Project` with the new project ID.

For the purpose of this tutorial, however, we will stay with our original project "cryoEM simulated" and use our object `cryoem_simulated_project`.

We re-upload the file that we just downloaded, renaming it by adding a `new_version_` prefix to its name.

In [6]:
cryoem_simulated_project.upload(
    remote_path="osfstorage/randomrot1D_nodisorder/new_version_4v6x_randomrot_copy0_defocus3.0_yes_noise.txt", 
    local_path="4v6x_randomrot_copy0_defocus3.0_yes_noise.txt")

Uploading 4v6x_randomrot_copy0_defocus3.0_yes_noise.txt to osfstorage/randomrot1D_nodisorder/new_version_4v6x_randomrot_copy0_defocus3.0_yes_noise.txt...
Done!
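The renamed remote path used in this upload can also be built from the original one rather than typed by hand. This is a minimal sketch; `with_name_prefix` is a hypothetical helper introduced for illustration.

```python
from pathlib import PurePosixPath


def with_name_prefix(remote_path, prefix):
    """Return remote_path with prefix added to the file name only."""
    path = PurePosixPath(remote_path)
    return str(path.with_name(prefix + path.name))


original = "osfstorage/randomrot1D_nodisorder/4v6x_randomrot_copy0_defocus3.0_yes_noise.txt"
print(with_name_prefix(original, "new_version_"))
```

The folder part of the path is left untouched, so the new file lands next to the original on OSF.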


Congratulations! You have successfully downloaded data from OSF and uploaded data to it.