# Saving Work

## Downloading a Project or Data

Jupyter supports downloading only one file at a time; you cannot select multiple files and download them together. To download multiple files, first archive them into a single file, optionally compressing it for a faster download.

### Archiving and Compressing 

GeoLab uses a Linux-based image that includes common Linux utilities such as `tar` and `gzip`. To save your work, use `tar` to create an archive of multiple files and `gzip` to compress the archive for downloading.

To download all the notebooks in the current directory, use `tar` to create an archive.

```
tar cvf notebooks.tar *.ipynb
```

>**Explainer:**
>
> - `tar` - the command
> - `cvf` - options: `c` create, `v` verbose (show the files added to the archive), `f` use the archive file named next
> - `notebooks.tar` - the name of the archive file
> - `*.ipynb` - the files to add to the archive

To download data, for example a directory of miniseed files, create an archive with `tar` and compress it with gzip in the same step for more efficient downloading.

```
tar czf miniseed.tar.gz miniseed_data
```

>**Explainer:**
>
> - `tar` - the command
> - `czf` - options: `c` create, `z` compress the archive with gzip, `f` use the archive file named next
> - `miniseed.tar.gz` - the name of the compressed archive
> - `miniseed_data` - the name of the directory to archive
>
> Note that the archive is compressed with gzip using the `z` option.
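
The same kind of archive can also be built from a notebook cell. The following is a minimal sketch using Python's standard `tarfile` module; the function name `archive_directory` is hypothetical, and `miniseed_data` is simply the example directory used above.

```python
import tarfile
from pathlib import Path


def archive_directory(source_dir, archive_path):
    """Create a gzip-compressed tar archive of source_dir (like `tar czf`)."""
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(f"{source_dir} is not a directory")

    # "w:gz" writes a new archive and compresses it with gzip
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores only the directory name, not its full path
        archive.add(source, arcname=source.name)

    # Reopen the archive and return the names that were stored
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

For example, `archive_directory("miniseed_data", "miniseed.tar.gz")` would write `miniseed.tar.gz` to the current directory and return the stored names, which is a quick way to confirm the contents before downloading.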

### Downloading 

To download files from GeoLab, select the file in the Jupyter dashboard (panel that lists the directory contents).

![](./images/jupyter_dashboard.png)

From the menu bar, select `File` > `Download`.

![](./images/file_download.png)


## Saving to GitHub

You can save a project to GitHub if you have an account. There are two ways to save a project to GitHub or another Git provider: forking and cloning.

Use **cloning** when you have write access to a repository or are working on your own projects. Cloning creates a local copy from which you can push changes directly. This is the standard workflow for team members collaborating on shared repositories or working on personal projects.

Use **forking** when you want to contribute to someone else's repository and don't have write permissions. In open source projects, forking creates an independent copy on GitHub that you control, allowing you to make changes freely and propose those changes to the original repository through pull requests. 

In practice, you'll fork repositories to contribute to projects you don't own, then clone your fork to your local machine to actually work on it, so forking and cloning often work together rather than being alternatives to each other.

### Forking

Forking creates a complete copy of a repository under your account on GitHub or GitLab, as a new repository. The fork is independent of, but linked to, the original. This means you have full control over your fork and can push changes to it. To submit changes to the original repository, open a pull request. The key advantage of forking is that the remote repository is created for you and can be cloned to your computer.

```
# 1. Fork on GitHub (click "Fork" button)

# 2. Clone YOUR fork to your computer
git clone https://github.com/YOUR-USERNAME/project.git
cd project

# 3. Add the original repo as "upstream"
git remote add upstream https://github.com/original-owner/project.git

# 4. Make changes
git add .
git commit -m "Fix bug"

# 5. Push to YOUR fork
git push origin main
```

>**Explainer:**
>
>In the first step, use the Fork button to create a copy of the repository in your GitHub account.
>
>The next step is to `clone` the forked repository to your computer, where you can make changes.
>
>Adding the original repo as `upstream` (the source) lets you receive its updates, for example with `git pull upstream main`.
>
>Updating your repository follows the standard commands: `git add .` puts all the updated files into the git staging area, and `git commit` saves a snapshot of the repository at that time.
>
>`git push` updates your repository on GitHub.

### Cloning

Cloning creates a local copy of a repository on your computer, but it does not create a new repository on GitHub/GitLab. Cloning copies all the code, history, and branches of the original repository, which becomes your remote repository (called `origin`). You can update your local repository by pulling changes from the original repository, but you can't push changes unless you have write access.

**Step 1**

```
# Clone the original repository
git clone https://github.com/original-owner/my-project.git
cd my-project
```

**Step 2**

Create a remote repository in your GitHub account.

1. Go to GitHub
2. Click "+" → "New repository"
3. Enter the repository name (the same name as the local repository): my-project
4. IMPORTANT: Do NOT initialize with README, .gitignore, or license
5. Click "Create repository"

**Step 3**

Connect your local repository to the remote repository.

```
# Point origin at your new repository
# (git clone already created an origin remote, so update its URL instead of adding it again)
git remote set-url origin https://github.com/YOUR-USERNAME/my-project.git

# Verify the remote was updated
git remote -v
# Output:
# origin  https://github.com/YOUR-USERNAME/my-project.git (fetch)
# origin  https://github.com/YOUR-USERNAME/my-project.git (push)

# Rename branch to main (if needed)
git branch -M main

# Push your code to GitHub
# The -u flag sets upstream tracking, so future pushes only need git push.
git push -u origin main
```

**Step 4**

Make changes and save them to the remote repository.

```
# Make changes
git add .
git commit -m "Fix bug"

# Push to YOUR repository
git push origin main
```

>**Explainer:**
>
>The clone workflow creates a local copy but, unlike the fork workflow, it does not create a remote repository.
>
>The remote repository must be created and connected manually for the local repository to save changes.
>
>The cloned repository cannot pull updates from the original repository because `origin` is set to the repository you created.


## Saving to S3

If you have an AWS account and want to save data in S3 for future use, GeoLab includes the AWS CLI and the boto3 package. To use either the CLI or boto3, add your AWS credentials as environment variables in your instance.

```
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-2
```
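
Before running an upload, it can help to confirm that the variables are actually visible to Python. Variables exported in a terminal are only inherited by programs started from that terminal, so a notebook kernel may not see them. The check below is a minimal sketch; the helper name `missing_aws_variables` is hypothetical, and it only looks for the three variable names shown above.

```python
import os

# The three variable names exported in the example above
REQUIRED_AWS_VARIABLES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_variables(environ=None):
    """Return the names of the expected AWS variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_AWS_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_variables()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All expected AWS environment variables are set")
```

The check reports only variable names and never prints their values, so the output is safe to leave in a saved notebook.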

This is an example script to upload a directory of files to an S3 bucket.

```
import boto3
import os
from pathlib import Path
from botocore.exceptions import NoCredentialsError, ClientError

def upload_miniseed_to_s3(local_directory, bucket_name, s3_prefix=''):
    """
    Upload all miniseed files from a local directory to S3 bucket.
    
    Parameters:
    -----------
    local_directory : str
        Path to local directory containing miniseed files
    bucket_name : str
        Name of the S3 bucket
    s3_prefix : str, optional
        Prefix (folder path) in S3 bucket where files will be uploaded
    """
    
    # Initialize S3 client
    s3_client = boto3.client('s3')
    
    # Get all miniseed files
    miniseed_extensions = ['.mseed', '.miniseed', '.ms']
    local_path = Path(local_directory)
    
    if not local_path.exists():
        print(f"Error: Directory {local_directory} does not exist")
        return
    
    # Find all miniseed files
    miniseed_files = []
    for ext in miniseed_extensions:
        miniseed_files.extend(local_path.rglob(f'*{ext}'))
    
    if not miniseed_files:
        print(f"No miniseed files found in {local_directory}")
        return
    
    print(f"Found {len(miniseed_files)} miniseed files")
    
    # Upload each file
    uploaded_count = 0
    failed_count = 0
    
    for file_path in miniseed_files:
        try:
            # Create S3 key (path in bucket)
            relative_path = file_path.relative_to(local_path)
            s3_key = os.path.join(s3_prefix, str(relative_path)).replace('\\', '/')
            
            # Upload file
            print(f"Uploading {file_path.name} to s3://{bucket_name}/{s3_key}")
            s3_client.upload_file(
                str(file_path),
                bucket_name,
                s3_key
            )
            uploaded_count += 1
            
        except FileNotFoundError:
            print(f"Error: File {file_path} not found")
            failed_count += 1
        except NoCredentialsError:
            print("Error: AWS credentials not found")
            return
        except ClientError as e:
            print(f"Error uploading {file_path.name}: {e}")
            failed_count += 1
    
    print("\nUpload complete!")
    print(f"Successfully uploaded: {uploaded_count} files")
    print(f"Failed: {failed_count} files")
```

Usage:

```
LOCAL_DIRECTORY = "/path/to/your/miniseed/files"
BUCKET_NAME = "your-bucket-name"
S3_PREFIX = "seismic-data/miniseed"  # Optional: folder in S3
    
# Upload files
upload_miniseed_to_s3(LOCAL_DIRECTORY, BUCKET_NAME, S3_PREFIX)
```
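
To confirm what ended up in the bucket, the uploaded keys can be listed afterwards. This is a minimal sketch, not part of the upload script: the helper name `list_uploaded_keys` is hypothetical, and it takes an existing client (such as the one returned by `boto3.client('s3')`) as an argument.

```python
def list_uploaded_keys(s3_client, bucket_name, s3_prefix=""):
    """Return the object keys stored under s3_prefix in bucket_name."""
    # list_objects_v2 returns a limited number of keys per call,
    # so a paginator is used to walk through every page of results
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket_name, Prefix=s3_prefix):
        # "Contents" is absent from a page when nothing matches the prefix
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# Example (requires boto3 and valid credentials):
# import boto3
# keys = list_uploaded_keys(boto3.client("s3"), "your-bucket-name", "seismic-data/miniseed")
# print(f"{len(keys)} objects found")
```

Comparing the number of keys returned with the "Successfully uploaded" count from the upload script is a simple consistency check, assuming the prefix held no objects beforehand.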

## Converting Notebooks to Python (note: nbconvert issue, ticket sent to 2i2c)

Developing scripts in Jupyter Lab allows you to explore and test ideas. To run your notebooks outside of Jupyter Lab, convert them to Python scripts with `nbconvert`, which is included with Jupyter and can convert notebooks to several formats.

```
jupyter nbconvert --to python notebook.ipynb
```

Download the script using `File` > `Download` from the menu bar.
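
If `nbconvert` is not available or not working in your environment, the code cells can be extracted with the standard library alone, because a notebook file is JSON. This is a minimal sketch, not a replacement for `nbconvert`: the function name `notebook_to_script` is hypothetical, and it assumes the version 4 notebook format, where cells are listed under a top-level `cells` key.

```python
import json
from pathlib import Path


def notebook_to_script(notebook_path, script_path):
    """Write the code cells of a notebook to a plain Python script."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))

    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of strings
        text = source if isinstance(source, str) else "".join(source)
        lines = []
        for line in text.splitlines():
            # IPython magics and shell escapes are not valid Python,
            # so they are commented out rather than copied as code
            if line.lstrip().startswith(("%", "!")):
                line = "# " + line
            lines.append(line)
        if lines:
            chunks.append("\n".join(lines))

    # Separate cells with a blank line and end the file with a newline
    Path(script_path).write_text("\n\n".join(chunks) + "\n", encoding="utf-8")
    return len(chunks)
```

Calling `notebook_to_script("notebook.ipynb", "notebook.py")` returns the number of code cells written. The magic-line check is deliberately simple and would also comment out an ordinary line that happens to start with `%` or `!`, so review the resulting script before running it.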

## Saving Graphics (note: nbconvert issue, ticket sent to 2i2c)