# Welcome to the ProgPy Dataset Example!

In this notebook, we will illustrate the process of downloading and analyzing a battery dataset from NASA's Prognostics Center of Excellence (PCoE) data repository. We will specifically focus on visualizing the current and voltage profiles of a battery over time.

The steps we will follow are:

1. Download and import the dataset.
2. Access the dataset description.
3. Access and plot the dataset data.

## Importing Modules

In [None]:
import matplotlib.pyplot as plt
import pickle
from progpy.datasets import nasa_battery

# ID of the battery whose data will be downloaded
DATASET_ID = 1

In the first step, we'll download the dataset for a single battery from NASA's Prognostics Center of Excellence (PCoE) data repository. This may take some time depending on the network speed.

To avoid re-downloading the dataset later, we save it to disk using Python's built-in `pickle` module. The saved file can then be loaded directly in subsequent analyses.

In [None]:
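# Download the description and data for the selected battery
(desc, data) = nasa_battery.load_data(DATASET_ID)

# Save to disk; the with-block closes the file once the dump completes
with open(f'dataset_{DATASET_ID}.pkl', 'wb') as f:
    pickle.dump((desc, data), f)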
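
To reuse the saved file in a later session, the download can be wrapped in a small cache check. The sketch below is an illustration and not part of ProgPy: `load_or_fetch` and its `fetch` argument are hypothetical names, and a stand-in function replaces `nasa_battery.load_data` so that the block runs without network access.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_fetch(path, fetch):
    """Return the cached (desc, data) tuple at `path`, fetching and caching it if absent.

    `fetch` is any zero-argument callable returning (desc, data), for example
    lambda: nasa_battery.load_data(DATASET_ID).
    """
    path = Path(path)
    if path.exists():
        # Only unpickle files you created yourself; pickle can execute arbitrary code.
        with path.open('rb') as f:
            return pickle.load(f)
    result = fetch()
    with path.open('wb') as f:
        pickle.dump(result, f)
    return result


# Stand-in for the real download (hypothetical data, for illustration only).
calls = []


def fake_fetch():
    calls.append(1)
    return ({'description': 'stand-in dataset'}, [{'voltage': [4.2, 4.1]}])


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / 'dataset_1.pkl'
    first = load_or_fetch(cache_file, fake_fetch)   # calls fake_fetch and writes the cache
    second = load_or_fetch(cache_file, fake_fetch)  # reads the cache, no second fetch
```

In the notebook itself, the call would look like `load_or_fetch(f'dataset_{DATASET_ID}.pkl', lambda: nasa_battery.load_data(DATASET_ID))`.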

Next, we'll access and print the description of the dataset. This description includes details about the dataset and the procedure followed to collect it. Understanding this description is crucial as it provides context for the data and can help guide subsequent analyses.

In [None]:
print(f'\nDataset {DATASET_ID}')
print(desc['description'])
print(f'Procedure: {desc["procedure"]}')
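
The description also holds per-run metadata under `desc['runs']`, where each entry has a `desc` string (both are used later in this notebook). A quick tally of those strings shows which run types the dataset contains. The sketch below uses a hypothetical stand-in list instead of the downloaded description; `count_run_types` is an illustrative helper, and the run labels are invented for the example.

```python
from collections import Counter

# Hypothetical stand-in for desc['runs']; real labels come from the dataset.
example_runs = [
    {'desc': 'reference discharge'},
    {'desc': 'rest post reference discharge'},
    {'desc': 'reference discharge'},
    {'desc': 'charge'},
]


def count_run_types(runs):
    """Count how many runs share each description string."""
    return Counter(run['desc'] for run in runs)


run_type_counts = count_run_types(example_runs)
for label, count in run_type_counts.most_common():
    print(f'{count:3d}  {label}')
```

With the real dataset, the equivalent call would be `count_run_types(desc['runs'])`.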

In this step, we access the data in the dataset and analyze a specific run. The data is organized as `[run_id][time][variable]`: `data[4]` is the run with index 4, and its variables are accessed by name, for example `data[4]['voltage']`.

We focus on the `current` and `voltage` profiles of the battery for the run with index 4 (runs are zero-indexed, so this is the fifth run in the dataset). We create a two-panel figure, with the top panel showing the current over time and the bottom panel showing the voltage over time.

Lastly, we graph all the reference discharge profiles available in the dataset. The code selects these as the runs whose description contains `reference discharge` but not `rest`, which excludes any run labeled as a rest period. We plot the voltage over time for each of these runs, which shows how the battery behaves under continuous discharge.

In [None]:
print(f'\nNumber of runs: {len(data)}')
print('\nAnalyzing run 4')
print(f'number of time indices: {len(data[4])}')
print(f"Details of run 4: {desc['runs'][4]}")
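
Beyond the number of time indices, a few summary statistics give a quick feel for a run before plotting it. The helper below is a minimal sketch, not a ProgPy function: `summarize_run` is a hypothetical name, and it assumes only that a run exposes the `relativeTime`, `current`, and `voltage` columns by name, as the plotting code in this notebook does. The example values are invented stand-ins, not real measurements.

```python
def summarize_run(run):
    """Return basic statistics for one run.

    `run` is any mapping from column name to a sequence of numbers, so a
    plain dict works here; name-based access such as data[4]['voltage']
    in the notebook follows the same pattern.
    """
    time = list(run['relativeTime'])
    voltage = list(run['voltage'])
    current = list(run['current'])
    return {
        'samples': len(time),
        'duration_s': time[-1] - time[0],
        'min_voltage': min(voltage),
        'max_voltage': max(voltage),
        'mean_current': sum(current) / len(current),
    }


# Hypothetical stand-in values, not real measurements.
example_run = {
    'relativeTime': [0.0, 10.0, 20.0, 30.0],
    'current': [1.0, 1.0, 1.0, 1.0],
    'voltage': [4.0, 3.5, 3.0, 2.5],
}
summary = summarize_run(example_run)
print(summary)
```

With the downloaded data, `summarize_run(data[4])` should give the same fields for run 4, assuming the run supports name-based column access as shown above.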

Let's plot the run and graph all reference discharge profiles.

In [None]:
# Plot the run: current in the top panel, voltage in the bottom panel
plt.figure()
plt.subplot(2, 1, 1)
plt.plot(data[4]['relativeTime'], data[4]['current'])
plt.ylabel('Current (A)')
# Title the top panel so it sits above the figure instead of between panels
plt.title('Run 4')

plt.subplot(2, 1, 2)
plt.plot(data[4]['relativeTime'], data[4]['voltage'])
plt.ylabel('Voltage (V)')
plt.xlabel('Time (s)')

# Graph all reference discharge profiles
indices = [i for i, x in enumerate(desc['runs']) if 'reference discharge' in x['desc'] and 'rest' not in x['desc']]
plt.figure()
for i in indices:
    plt.plot(data[i]['relativeTime'], data[i]['voltage'], label=f"Run {i}")
plt.title('Reference discharge profiles')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
# Show the per-run labels assigned above
plt.legend()
plt.show()
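
One way to turn the reference discharge curves into a single number per run is to integrate the current over time, which gives the charge drawn during the run. The block below is a hedged sketch of that idea using the trapezoidal rule; `discharged_capacity_ah` is a hypothetical helper, not part of ProgPy, and the sign convention of `current` in this dataset is not stated here, so take the absolute value of the result if discharge current is recorded as negative. The example values are invented.

```python
def discharged_capacity_ah(time_s, current_a):
    """Integrate current over time with the trapezoidal rule; return amp-hours."""
    time_s = list(time_s)
    current_a = list(current_a)
    coulombs = 0.0
    for k in range(1, len(time_s)):
        dt = time_s[k] - time_s[k - 1]
        # Average of the two neighboring samples times the interval length
        coulombs += 0.5 * (current_a[k] + current_a[k - 1]) * dt
    return coulombs / 3600.0  # 1 Ah = 3600 C


# Hypothetical stand-in: a constant 2 A draw for one hour.
example_time = [0.0, 1800.0, 3600.0]
example_current = [2.0, 2.0, 2.0]
capacity = discharged_capacity_ah(example_time, example_current)
print(f'{capacity:.2f} Ah')
```

Applied to the notebook's data, something like `[discharged_capacity_ah(data[i]['relativeTime'], data[i]['current']) for i in indices]` would give one value per reference discharge run, which can then be compared across runs.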

## Conclusion

In this notebook, we demonstrated how to download a battery dataset from NASA's PCoE data repository, read the dataset's description, and access its data. We visualized the current and voltage profiles of a single battery run and plotted the voltage over time for all reference discharge runs.

These steps are a starting point for preliminary analysis of prognostics datasets. Inspecting the raw profiles before modeling helps when designing prognostic models and understanding system behavior.

For more ProgPy-related information, please refer to the ProgPy [Documentation](https://nasa.github.io/progpy/).