Description:
After you've become familiar with downloading data from the NEON Data API, or from other resources on the internet, into your cloud instances, you will eventually need to move those data somewhere more permanent.
It's important to accept that many of these public data repositories are stable and that data will be available from them in the future.
This means that you should not create copies of original data unless the data are very large and downloading them again would take a prohibitive amount of time.
CyVerse Data Store uses a platform called iRODS to manage its data. iRODS has a command line application called iCommands for moving data over the terminal.
First, we need to initiate a connection to the CyVerse iRODS.
1. Launch a terminal in a VICE application
For example, JupyterLab or RStudio.
2. In the terminal, type iinit
This should print a series of prompts in the terminal.
3. Enter the following value for each field:
- host name (DNS):
data.cyverse.org
- port number:
1247
- irods user name:
<your CyVerse username>
- irods zone:
iplant
- current iRODS password:
<your current password>
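The same connection settings can also be stored ahead of time in the iRODS environment file, so that iinit should only need to ask for the password. A minimal sketch, assuming the standard iCommands 4.x environment file location; `CYVERSE_USER` is a placeholder variable introduced here for illustration:

```shell
# Placeholder: set CYVERSE_USER to your own CyVerse username before running.
CYVERSE_USER="${CYVERSE_USER:-username}"

# iCommands reads this file by default; IRODS_ENVIRONMENT_FILE overrides the path.
env_file="${IRODS_ENVIRONMENT_FILE:-$HOME/.irods/irods_environment.json}"

if [ -e "$env_file" ]; then
    echo "Leaving existing $env_file untouched"
else
    mkdir -p "$(dirname "$env_file")"
    # Same values as the iinit prompts above; the password is not stored here.
    cat > "$env_file" <<EOF
{
    "irods_host": "data.cyverse.org",
    "irods_port": 1247,
    "irods_user_name": "$CYVERSE_USER",
    "irods_zone_name": "iplant"
}
EOF
fi
```

With this file in place, running iinit is still required to enter your password and authenticate.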
4. You should now be authenticated to the Data Store.
To test, type ils
You should now see the contents of your personal Data Store. If nothing is echoed back, try Step 2 again.
5. Upload a single file to the Data Store using iput
Create a new file in RStudio and give it a name, e.g. test.R
You need to select the file you want to copy, and the location in the Data Store you want to copy it to.
Here iput takes a single file, test.R, and copies it from the container to the Data Store folder /iplant/home/username/ag2pi_workshop/
The flags K, P, v, and f are described in the help file (iput -h).
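Based on the file, destination, and flags named in this step, the upload command would look roughly like the following. This is a sketch: `username` is a placeholder for your own CyVerse username, the `touch` line stands in for creating the file in RStudio, and the check for `iput` is only there so the snippet fails gently outside a VICE terminal.

```shell
# Placeholder: replace "username" (or set CYVERSE_USER) with your CyVerse username.
DEST="/iplant/home/${CYVERSE_USER:-username}/ag2pi_workshop/"

# Stand-in for creating the file in RStudio.
touch test.R

if command -v iput >/dev/null 2>&1; then
    # -K verify checksum, -P show progress, -v verbose, -f overwrite if it exists
    iput -KPvf test.R "$DEST"
else
    echo "iCommands not found; run this in a VICE terminal after iinit" >&2
fi
```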
6. Upload a folder with recursive sub-folders and files
Create a new folder in RStudio, and then create a folder inside of it.
Next, we want to upload an entire directory with many folders and files in it.
Adding the flags b for bulk and r for recursive to the iput command will upload the entire directory folder1 to the Data Store.
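A sketch of the recursive upload, combining the flags from the previous step with b and r. The sub-folder name `folder2` and the two empty files are hypothetical stand-ins for whatever you created in RStudio, and `username` is again a placeholder.

```shell
# Placeholder: replace "username" (or set CYVERSE_USER) with your CyVerse username.
DEST="/iplant/home/${CYVERSE_USER:-username}/ag2pi_workshop/"

# Hypothetical example tree: a folder, a sub-folder, and a file in each.
mkdir -p folder1/folder2
touch folder1/notes.txt folder1/folder2/more_notes.txt

if command -v iput >/dev/null 2>&1; then
    # -b bulk upload (helps with many small files), -r recurse into sub-folders
    iput -KPbvrf folder1 "$DEST"
else
    echo "iCommands not found; run this in a VICE terminal after iinit" >&2
fi
```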
7. The P flag for progress and the v flag for verbose will echo out the progress of the upload until it completes.
When it is complete, the terminal should be available again.
To test whether your files are now in CyVerse, run ils on the folder you uploaded to.
You should be able to see the contents of your directory in the Data Store.
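One way to run that check is a recursive listing of the destination folder. A minimal sketch, assuming the same placeholder destination used for the upload:

```shell
# Placeholder: replace "username" (or set CYVERSE_USER) with your CyVerse username.
DEST="/iplant/home/${CYVERSE_USER:-username}/ag2pi_workshop/"

if command -v ils >/dev/null 2>&1; then
    # -r lists the collection and everything beneath it
    ils -r "$DEST"
else
    echo "iCommands not found; run this in a VICE terminal after iinit" >&2
fi
```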
8. These files are now in your private user space. No one else can see them, but if you want to share them, you can do so by modifying their permissions directly in the Discovery Environment, as shown in Step 1, or by using the ichmod command.
Follow the instructions in the help menu (ichmod -h) to set the user privileges and ownership.
For example, you can make your data directory public on the internet as a read-only archive.
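A hedged sketch of what such a permission change could look like. It assumes that the CyVerse account named `anonymous` represents unauthenticated visitors; that name is an assumption here, so confirm it in the CyVerse documentation before relying on it. `username` is a placeholder as before.

```shell
# Placeholder: replace "username" (or set CYVERSE_USER) with your CyVerse username.
SHARE_DIR="/iplant/home/${CYVERSE_USER:-username}/ag2pi_workshop"

if command -v ichmod >/dev/null 2>&1; then
    # Assumption: "anonymous" is the account CyVerse uses for public, no-login access.
    # -r applies the read permission to everything inside the folder.
    ichmod -r read anonymous "$SHARE_DIR"
    # Show the resulting access control list to confirm the change.
    ils -A "$SHARE_DIR"
else
    echo "iCommands not found; run this in a VICE terminal after iinit" >&2
fi
```

Granting read access to a specific collaborator follows the same pattern, with their username in place of `anonymous`.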
You will likely also need to download data from the Data Store into your running Apps.
9. Use the ils command to look for some shared data in the Data Store
10. Download a file using iget
This should download an Rmd file into your local instance, in whatever working directory your terminal is currently in.
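A sketch of browsing and then downloading a single file. The path of the shared file is not given here, so the snippet expects you to supply it through `REMOTE_FILE`, a variable introduced for illustration; `/iplant/home/shared` is assumed to be the community shared area.

```shell
# Set REMOTE_FILE to the full Data Store path of a file you found with ils.
REMOTE_FILE="${REMOTE_FILE:-}"

if ! command -v iget >/dev/null 2>&1; then
    echo "iCommands not found; run this in a VICE terminal after iinit" >&2
elif [ -z "$REMOTE_FILE" ]; then
    # Nothing chosen yet: list the shared area to look for data.
    ils /iplant/home/shared
    echo "Set REMOTE_FILE to one of the paths above and run again" >&2
else
    # -P show progress, -v verbose; "." is the current working directory
    iget -Pv "$REMOTE_FILE" .
fi
```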
11. Download a directory using iget
Here we prefix the command with time to report how long the download takes.
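A sketch of a timed directory download. As with the single file, the shared directory path is left for you to supply through `REMOTE_DIR`, a variable introduced for illustration.

```shell
# Set REMOTE_DIR to the full Data Store path of a folder you found with ils.
REMOTE_DIR="${REMOTE_DIR:-}"

if [ -z "$REMOTE_DIR" ]; then
    echo "Set REMOTE_DIR to a Data Store folder path and run again" >&2
elif command -v iget >/dev/null 2>&1; then
    # time reports elapsed time once iget finishes; -r downloads recursively
    time iget -rPv "$REMOTE_DIR" .
else
    echo "iCommands not found; run this in a VICE terminal after iinit" >&2
fi
```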
The CyVerse Data Store also supports WebDAV, an HTTPS-based protocol, for read-only data downloads from the Data Store.
We can use the wget or curl commands in the terminal to download files this way.
12. Download a directory using wget
Again, we're using the time command to measure how long the download takes. We're also using some wget flags to get just the data and folders back from the Data Store.
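A sketch of a recursive WebDAV download with wget. The exact flags used originally are not shown, so the ones below are an assumption chosen to match the description: they follow the folder tree without climbing to parent folders and skip the generated index pages. `DAV_URL` and `CUT_DIRS` are variables introduced for illustration; set `DAV_URL` to the WebDAV address of a public folder, ending in a slash.

```shell
# Set DAV_URL to the HTTPS WebDAV address of a public Data Store folder.
DAV_URL="${DAV_URL:-}"
# Number of leading path components to drop from the saved paths; adjust to your URL.
CUT_DIRS="${CUT_DIRS:-3}"

if [ -z "$DAV_URL" ]; then
    echo "Set DAV_URL to a public WebDAV folder address and run again" >&2
else
    # -r recurse, -np never ascend to the parent folder, -nH no hostname folder,
    # --cut-dirs trims leading folders, -R skips the auto-generated index pages
    time wget -r -np -nH --cut-dirs="$CUT_DIRS" -R "index.html*" "$DAV_URL"
fi
```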