In [1]:
# Read and display the ReproLab documentation
from IPython.display import Markdown

# Read the markdown file
with open('reprolab_data/demo.md', 'r', encoding='utf-8') as f:
    content = f.read()

# Display the markdown content
Markdown(content)

# ReproLab Demo

Welcome to ReproLab! This extension helps you make your research more reproducible.

## Features

- **Create Experiments**: Save immutable snapshots of your code and data
- **Track Metrics**: Monitor execution time and resource usage
- **Manage Dependencies**: Automatically gather and pin package versions
- **Archive Data**: Store your data securely in AWS S3
- **Publish**: Share your work on Zenodo
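
As an illustration of what gathering and pinning package versions can look like, here is a minimal sketch that uses only the standard library. It is not ReproLab's implementation; the function name `pinned_requirements` is hypothetical, and the output simply follows the `name==version` convention of a `requirements.txt` file.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution.

    Hypothetical helper for illustration; not part of ReproLab.
    """
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions whose metadata has no name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Print the pins; redirect to a file to get a requirements.txt
    print("\n".join(pinned_requirements()))
```

Writing these lines to a file next to the notebook records the exact environment the results were produced in.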

## Getting Started

1. Use the sidebar to access ReproLab features
2. Create an experiment to save your current state
3. Track metrics to monitor performance
4. Archive your data for long-term storage
5. Publish your work when ready

## Example Usage of the `persistio` Decorator

To cache and archive the datasets you use, whether they come from local files or from APIs, we developed a simple decorator. Place it over the function that fetches a dataset, and the result is cached both locally and in the cloud. The dataset you used is therefore archived, calls to external APIs are kept to a minimum, and you don't need to keep the original file around after the first run.
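
To make the idea concrete, here is a minimal sketch of how a caching decorator of this kind could work. It is not ReproLab's implementation: the name `cache_to_disk`, the cache directory, and the SHA-256 key scheme are assumptions made for illustration, and the cloud upload step is left out entirely.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location; not ReproLab's actual directory.
CACHE_DIR = Path(tempfile.gettempdir()) / "persist_sketch_cache"


def cache_to_disk(func):
    """Illustrative stand-in for a persistio-style decorator (local cache only)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Derive a cache key from the function name and its arguments.
        key_source = repr((func.__name__, args, sorted(kwargs.items())))
        key = hashlib.sha256(key_source.encode("utf-8")).hexdigest()
        cache_file = CACHE_DIR / f"{key}.pkl"

        # Cache hit: return the stored result without calling the function.
        if cache_file.exists():
            with cache_file.open("rb") as fh:
                return pickle.load(fh)

        # Cache miss: call the function, then store the result for next time.
        result = func(*args, **kwargs)
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        with cache_file.open("wb") as fh:
            pickle.dump(result, fh)
        return result

    return wrapper
```

With this sketch, the first call to a decorated function runs it and writes the result to disk; later calls with the same arguments read the pickle file instead, which is why an external API would be contacted only once.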

Here is an example using one of NASA's open APIs. To try it yourself, copy the code, but first provide your bucket name, access key, and secret key in the `AWS S3 Configuration` section of the left-hand panel.

```python
import requests
import pandas as pd
from io import StringIO

# The two lines below are all that you need to add
from reprolab import persistio
@persistio()
def get_exoplanets_data_from_nasa():
    # Define the NASA Exoplanet Archive TAP endpoint
    url = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"

    # Define the ADQL query to fetch 10 exoplanets with basic properties
    query = """
    SELECT TOP 10
        pl_name AS planet_name,
        hostname AS host_star,
        pl_orbper AS orbital_period_days,
        pl_rade AS planet_radius_earth,
        disc_year AS discovery_year
    FROM
        ps
    WHERE
        default_flag = 1
    """

    # Encode the query for the URL
    params = {
        "query": query,
        "format": "csv"  # Request CSV format for easy parsing
    }

    # Make the GET request
    response = requests.get(url, params=params)

    # Raise an exception on HTTP errors so that df is never returned undefined
    response.raise_for_status()

    # Read CSV data into a DataFrame
    df = pd.read_csv(StringIO(response.text))

    # Print the DataFrame
    print(df)

    return df

exoplanets_data = get_exoplanets_data_from_nasa()
```

For more information, visit our [documentation](https://github.com/your-repo/reprolab). 


In [3]:
import requests
import pandas as pd
from io import StringIO

# The two lines below are all that you need to add
from reprolab import persistio
@persistio()
def get_exoplanets_data_from_nasa():
    # Define the NASA Exoplanet Archive TAP endpoint
    url = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"

    # Define the ADQL query to fetch 10 exoplanets with basic properties
    query = """
    SELECT TOP 10
        pl_name AS planet_name,
        hostname AS host_star,
        pl_orbper AS orbital_period_days,
        pl_rade AS planet_radius_earth,
        disc_year AS discovery_year
    FROM
        ps
    WHERE
        default_flag = 1
    """

    # Encode the query for the URL
    params = {
        "query": query,
        "format": "csv"  # Request CSV format for easy parsing
    }

    # Make the GET request
    response = requests.get(url, params=params)

    # Raise an exception on HTTP errors so that df is never returned undefined
    response.raise_for_status()

    # Read CSV data into a DataFrame
    df = pd.read_csv(StringIO(response.text))

    # Print the DataFrame
    print(df)

    return df

exoplanets_data = get_exoplanets_data_from_nasa()


[persistio] Function: get_exoplanets_data_from_nasa
[persistio] Hash: cc52de54e2adb1fa9730d94a6caea650
[persistio] Attempting to load from local cache...
[persistio] Successfully loaded from local cache!
