In [2]:
from reprolab.environment import create_new_venv
create_new_venv('my_venv')

[✔] Virtual environment 'my_venv' created at /Users/spoton/Documents/master_thesis/poc/reprolab/my_venv
[✔] Pip upgraded
[✔] Installed essential packages: ipykernel, boto3, ipylab, pandas, numpy, xarray, requests, pyarrow, nbformat, pyyaml, ipywidgets
Installed kernelspec my_venv_kernel in /Users/spoton/Library/Jupyter/kernels/my_venv_kernel
[✔] Kernel 'my_venv_kernel' registered for Jupyter

🎉 Setup complete!
➡ To use the virtual environment in Jupyter:
   1. Restart your Jupyter server
   2. Select kernel: Python (my_venv)


# ReproLab Demo

Welcome to ReproLab! This extension helps you make your research more reproducible.

## Features

- **Create Experiments**: Automatically save immutable snapshots of your code under `git` tags to preserve the **exact code and outputs**
- **Manage Dependencies**: Automatically gather and pin **exact package versions**, so that others can set up your environment with one command
- **Cache Data**: Call an external API or load a dataset manually only once; the caching function handles the rest
- **Archive Data**: The caching function can also store the compressed data in *AWS S3*, so you always know which data was used and make fewer API calls
- **Publishing Guide**: The reproducibility checklist and the automatically generated reproducibility package make publishing to platforms such as Zenodo straightforward
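
To make the idea of pinning exact package versions concrete, here is a minimal sketch that uses only the Python standard library. It is not ReproLab's implementation; the function name `pinned_requirements` is hypothetical and chosen for illustration.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution.

    Illustrative helper only (hypothetical name, not part of ReproLab).
    """
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that report no name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)
```

Writing `"\n".join(pinned_requirements())` to a `requirements.txt` file gives others a list they can install with `pip install -r requirements.txt`.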

## Getting Started

1. Use the sidebar to view ReproLab features
2. Create a virtual environment and pin your dependencies in the ReproLab section `Create reproducible environment`
3. Create an experiment to save your current state in the ReproLab section `Create experiment`
4. Archive your data for long-term storage: open the ReproLab section `Demo` and experiment with it
5. Publish your work when ready, using the reproducibility checklist from the section `Reproducibility Checklist`

## Example Usage of the `persistio` Decorator

To cache and archive the datasets you use, whether they come from local files or APIs, we developed a simple decorator. Place it on the function that fetches the dataset, and the result is cached both locally and in the cloud. The dataset is archived, calls to external APIs are kept to a minimum, and you don't need to keep the original file around after the first run.

Here is an example using one of NASA's open APIs. To try it yourself, copy the code and provide the bucket name, access key, and secret key in the `AWS S3 Configuration` section of the left-hand panel.

```python
import requests
import pandas as pd
from io import StringIO

# The two lines below are all that you need to add
from reprolab.experiment import persistio
@persistio()
def get_exoplanets_data_from_nasa():
    url = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"

    query = """
    SELECT TOP 10
        pl_name AS planet_name,
        hostname AS host_star,
        pl_orbper AS orbital_period_days,
        pl_rade AS planet_radius_earth,
        disc_year AS discovery_year
    FROM
        ps
    WHERE
        default_flag = 1
    """

    params = {
        "query": query,
        "format": "csv"
    }

    response = requests.get(url, params=params, timeout=30)
    # Fail loudly on HTTP errors instead of returning an undefined `df`
    response.raise_for_status()

    df = pd.read_csv(StringIO(response.text))
    print(df)
    return df

exoplanets_data = get_exoplanets_data_from_nasa()
```

If you run this cell twice, the logs show that the second run reads the data from the compressed file in the local cache. If you lose access to the local cache (e.g. by cloning the repository on a different device), `persistio` fetches the data from the cloud archive.
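
The local half of this behaviour can be sketched as a small decorator. This is a simplified illustration under stated assumptions, not the actual `persistio` implementation: the name `cache_to_disk`, the `cache_dir` parameter, the SHA-256 key, and the use of `pickle` are all stand-ins chosen here, and the cloud fallback is omitted.

```python
import functools
import hashlib
import inspect
import pickle
from pathlib import Path


def cache_to_disk(cache_dir=".cache_sketch"):
    """Cache a function's result on disk (hypothetical sketch, not persistio)."""
    cache_path = Path(cache_dir)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                source = inspect.getsource(func)
            except (OSError, TypeError):
                source = func.__qualname__  # source text is not always available
            # The key changes when the function's code or its arguments change
            key = repr((func.__qualname__, source, args, sorted(kwargs.items())))
            digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
            target = cache_path / f"{digest}.pkl"

            if target.exists():
                # Cache hit: skip the expensive call entirely
                with target.open("rb") as fh:
                    return pickle.load(fh)

            # Cache miss: run the function once and store its result
            result = func(*args, **kwargs)
            cache_path.mkdir(parents=True, exist_ok=True)
            with target.open("wb") as fh:
                pickle.dump(result, fh)
            return result

        return wrapper

    return decorator
```

Decorating a data-loading function with `@cache_to_disk()` means its body runs only on the first call with a given set of arguments; later calls read the stored result. A cloud archive would add one more lookup step between the local cache miss and the real function call.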


For more information, visit our [documentation](https://github.com/your-repo/reprolab). 


In [2]:
import requests
import pandas as pd
from io import StringIO

# The two lines below are all that you need to add
from reprolab.experiment import persistio
@persistio()
def get_exoplanets_data_from_nasa():
    url = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"

    query = """
    SELECT TOP 10
        pl_name AS planet_name,
        hostname AS host_star,
        pl_orbper AS orbital_period_days,
        pl_rade AS planet_radius_earth,
        disc_year AS discovery_year
    FROM
        ps
    WHERE
        default_flag = 1
    """

    params = {
        "query": query,
        "format": "csv"
    }

    response = requests.get(url, params=params, timeout=30)
    # Fail loudly on HTTP errors instead of returning an undefined `df`
    response.raise_for_status()

    df = pd.read_csv(StringIO(response.text))
    print(df)
    return df

exoplanets_data = get_exoplanets_data_from_nasa()


[persistio] Function: get_exoplanets_data_from_nasa
[persistio] Hash: ca840447667cb2059aa83ed68ec9e995
✅ Metadata written to Untitled.ipynb_persistio_archive.yaml
[persistio] Trigger logged for function: get_exoplanets_data_from_nasa
[persistio] Attempting to load from local cache...
[persistio] Successfully loaded from local cache!


In [1]:
from reprolab.experiment import download_notebook_cache_package
download_notebook_cache_package('Untitled.ipynb')

[download_notebook_cache_package] Processing notebook: Untitled.ipynb
[download_notebook_cache_package] Found 1 cached functions
[download_notebook_cache_package] Using cloud storage: viciooo-dvc-testing
[download_notebook_cache_package] Copied local file: ca840447667cb2059aa83ed68ec9e995.DataFrame.parquet
[download_notebook_cache_package] Creating zip package: Untitled.ipynb_cache_package_20250620_222851.zip
[download_notebook_cache_package] ✅ Successfully created package: Untitled.ipynb_cache_package_20250620_222851.zip
[download_notebook_cache_package] 📦 Package contains 1 cached files + metadata


'Untitled.ipynb_cache_package_20250620_222851.zip'
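
To check what ended up in such a package without extracting it, the standard library's `zipfile` module is enough. The helper name `list_package_contents` below is hypothetical. The internal layout of the package is not documented here, so the sketch only lists file names and assumes nothing about them.

```python
import zipfile


def list_package_contents(package_path):
    """Return the names of all files stored in a zip package (illustrative helper)."""
    with zipfile.ZipFile(package_path) as archive:
        return archive.namelist()
```

Calling it with the file name returned by `download_notebook_cache_package` should show the cached files and the metadata mentioned in the log.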

In [1]:
import ipywidgets as widgets
from IPython.display import display

# Create a dropdown widget with a list of options
options = ['Option 1', 'Option 2', 'Option 3']
dropdown = widgets.Dropdown(
    options=options,
    value=options[0],
    description='Select:',
)

# Create a button widget
button = widgets.Button(description="Print Selection")

# Define the button click event handler
def on_button_click(b):
    print(f"Selected value: {dropdown.value}")

# Attach the event handler to the button
button.on_click(on_button_click)

# Display the widgets
display(dropdown)
display(button)

Dropdown(description='Select:', options=('Option 1', 'Option 2', 'Option 3'), value='Option 1')

Button(description='Print Selection', style=ButtonStyle())