# OSDF Examples

What if we don't want to download the data and the software container up front? For example:

- the data is big and we don't want to store it locally
- we're part of a project and want everyone to be using the same central container

It turns out we can fetch both directly from the OSDF (Open Science Data Federation)!

In [None]:
cd ~/tutorial-fastqc

## Exploring the Data

The data and a `.sif` container image are hosted at this OSDF "bucket":

In [None]:
OSDF_LOCATION="osdf:///ospool/uc-shared/public/osg-training/tutorial-fastqc"

We can use the Pelican client to view which files are available:

In [None]:
pelican object ls ${OSDF_LOCATION}

In [None]:
pelican object ls ${OSDF_LOCATION}/data

In [None]:
pelican object ls ${OSDF_LOCATION}/sif

We could use `pelican object get` to fetch any of the objects to explore them locally, but instead, let's use them in jobs. 

## Using Objects from OSDF in Jobs

Using the data and .sif file from OSDF is as simple as adding the OSDF URL to the submit file as shown here: 

In [None]:
cat alt-submit/osdf-fastqc.submit

Note: instead of writing out the whole URL wherever we need it, the submit file uses an intermediate variable, referenced as `$(OSDF_LOCATION)`.
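As a rough sketch of the pattern (not the tutorial's actual file), the OSDF-related lines of such a submit file might look like the fragment below. The `.sif` name and the sample name are placeholders, and the fragment is written to a scratch file so nothing in `alt-submit/` is overwritten.

```shell
# Write a hypothetical submit-file fragment to a scratch location.
# The quoted 'EOF' keeps the shell from expanding $(OSDF_LOCATION).
sketch=$(mktemp)
cat > "$sketch" <<'EOF'
# Define the OSDF prefix once, then reuse it below
OSDF_LOCATION = osdf:///ospool/uc-shared/public/osg-training/tutorial-fastqc

# Container image fetched from OSDF (EXAMPLE.sif is a placeholder name)
container_image = $(OSDF_LOCATION)/sif/EXAMPLE.sif

# Input data fetched from OSDF (SAMPLE_A is a placeholder sample name)
transfer_input_files = $(OSDF_LOCATION)/data/SAMPLE_A.trim.sub.fastq
EOF

cat "$sketch"
```

Defining the prefix once means that moving the data to a different OSDF path only requires changing a single line.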

In [None]:
condor_submit alt-submit/osdf-fastqc.submit

In [None]:
condor_q

## Multiple Jobs

To run multiple jobs, we could use Pelican to generate the list of samples: 

In [None]:
pelican object ls -L ${OSDF_LOCATION}/data | cut -d '/' -f 8 | cut -d . -f 1 > alt-submit/samples.txt
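To see what the two `cut` stages do, here is a minimal sketch applied to a single made-up listing line. It assumes each line of the listing starts with the full object path; `SAMPLE_A` and the trailing size column are hypothetical, and the real output format may differ.

```shell
# Hypothetical listing line: full object path followed by an extra column
line="/ospool/uc-shared/public/osg-training/tutorial-fastqc/data/SAMPLE_A.trim.sub.fastq 1234"

extract_sample() {
  # Field 8 of the '/'-separated path is the file name (plus any trailing columns);
  # the second cut keeps only the text before the first '.'
  cut -d '/' -f 8 | cut -d . -f 1
}

echo "$line" | extract_sample   # prints SAMPLE_A
```

The field number depends on how deep the path is, so it would need adjusting if the data lived under a different prefix.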

Then edit the submit file:
* Change the queue statement to iterate through the list of samples: 
    
    `queue sample from alt-submit/samples.txt`
* Replace all references to a specific sample file with the variable from the queue statement:
    
    `transfer_input_files = $(OSDF_LOCATION)/data/$(sample).trim.sub.fastq`
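Before submitting, it can help to preview which input URL each job would request. The sketch below imitates the queue statement with a plain shell loop. It uses a scratch file with two made-up sample names (`SAMPLE_A`, `SAMPLE_B`) in place of `alt-submit/samples.txt`, and `preview_inputs` is a helper name introduced only for this illustration.

```shell
OSDF_LOCATION="osdf:///ospool/uc-shared/public/osg-training/tutorial-fastqc"

# Hypothetical sample list standing in for alt-submit/samples.txt
samples_file=$(mktemp)
printf '%s\n' SAMPLE_A SAMPLE_B > "$samples_file"

preview_inputs() {
  # One line per sample, built the same way as the transfer_input_files value
  while IFS= read -r sample; do
    [ -n "$sample" ] || continue   # this preview skips blank lines
    echo "${OSDF_LOCATION}/data/${sample}.trim.sub.fastq"
  done < "$1"
}

preview_inputs "$samples_file"
```

Running `preview_inputs alt-submit/samples.txt` on the real list shows one URL per job, which makes a malformed or duplicated sample name easy to spot.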