# PyKOALA Data Containers

## Table of contents:

1. [Importing class](#importing-class)
2. [`HistoryRecord` class](#historyrecord-class)
    - [`HistoryRecord` methods](#historyrecord-methods)
        - [`to_str`](#to_str)
3. [`DataContainerHistory` class](#datacontainerhistory-class)
    - [`DataContainerHistory` methods](#datacontainerhistory-methods)
        - [`initialise_record`](#initialise_record)
        - [`log_record`](#log_record)
        - [`is_record`](#is_record)
        - [`find_record`](#find_record)
        - [`dump_to_header`](#dump_to_header)
        - [`dump_to_text`](#dump_to_text)


## Importing class

**Note: Make sure to run the following cells in order to ensure correct execution.**

PyKOALA uses data containers to organise, read, and manage input/output. `PyKOALA` data containers can be imported from the `data_container` module:

In [2]:
import pykoala.data_container 

Alternatively, users can import the required classes explicitly. This tutorial uses that approach in its examples.

## `HistoryRecord` class

The `HistoryRecord` class represents a structured record in a log. This is typically useful for logging, note-taking, or any system that keeps track of messages or events.

In [3]:
from pykoala.data_container import HistoryRecord

hist_record = HistoryRecord(title="Error log", comments="This is line one.\nThis is line two.")
print('Title of history record: ', hist_record.title)
print('Comments in history record: ', hist_record.comments)

Title of history record:  Error log
Comments in history record:  ['This is line one.', 'This is line two.']


### `HistoryRecord` **methods** 

#### `to_str`

`HistoryRecord` can convert the record into a string with `to_str`:

In [4]:
hist_record.to_str()

'Error log: This is line one.\nThis is line two.'

However, this output is hard to read. Since `comments` is a list, a clearer way to show its contents is to loop over it:

In [5]:
# Each item in `comments` is one line of the original comment text
for comment in hist_record.comments:
    print(comment)

This is line one.
This is line two.
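
The `\n` appears literally in the `to_str` output above because a notebook cell echoes the `repr` of a string. As a small plain-Python illustration (using a literal copy of that string rather than a PyKOALA object), passing the string to `print` renders the line break instead:

```python
# Literal copy of the string returned by hist_record.to_str() above.
record_text = 'Error log: This is line one.\nThis is line two.'

# repr() is what a notebook cell echoes: the newline shows as the characters \n.
echoed = repr(record_text)
print(echoed)

# print() renders the newline, so the record spans two lines.
print(record_text)

# splitlines() recovers one string per line, e.g. for further processing.
lines = record_text.splitlines()
print(lines)
```

In the tutorial itself, the same effect should follow from `print(hist_record.to_str())`.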


## `DataContainerHistory` class

The `DataContainerHistory` class stores and manages the history of data reduction steps as a log of entries, so that the steps applied to a dataset can be traced. Each entry records the details of one processing step.

In [6]:
from pykoala.data_container import DataContainerHistory
initial_entries = [
    ("Step 1", "Loaded data"),
    ("Step 2", "Filtered outliers"),
    HistoryRecord("Step 3", "Normalized data", "preprocessing")
]

history_log = DataContainerHistory(list_of_entries=initial_entries)

# Accessing the recorded entries
for record in history_log.record_entries:
    print(record.to_str())


Step 1: Loaded data
Step 2: Filtered outliers
Step 3: Normalized data


### `DataContainerHistory` **methods** 

`DataContainerHistory` provides several methods for managing the log.

#### `initialise_record`

We can explicitly start a new log with `initialise_record`. Let's redo the previous cell with this method:

In [7]:
initial_entries = [
    ("Step 1", "Loaded data"),
    ("Step 2", "Filtered outliers"),
    HistoryRecord("Step 3", "Normalized data", "preprocessing")
]

history_log = DataContainerHistory()
history_log.initialise_record(list_of_entries=initial_entries)

history_log.show()

Step 1: Loaded data
Step 2: Filtered outliers
Step 3: Normalized data


In the cell above we also used the `show()` method to present the content of the history log. 

#### `log_record`

We can add a new record to an existing history log with the `log_record` method:

In [8]:
# We use the same history log as before
initial_entries = [
    ("Step 1", "Loaded data"),
    ("Step 2", "Filtered outliers"),
    HistoryRecord("Step 3", "Normalized data", "preprocessing")
]

history_log = DataContainerHistory()
history_log.initialise_record(list_of_entries=initial_entries)

history_log.log_record(title='Step 4', comments='Additional data added with log_record method') 

history_log.show()


Step 1: Loaded data
Step 2: Filtered outliers
Step 3: Normalized data
Step 4: Additional data added with log_record method


#### `is_record` 

We can check whether a record with a given title is present in the history log with the `is_record` method. Following the previous cell:

In [9]:
print(history_log.is_record(title='Step 1'))
print(history_log.is_record(title='Step 42'))

True
False


In the first case, the output is `True` because there is a record with the title `Step 1`. However, there is no record with the title `Step 42`, so the second line returns `False`.

#### `find_record`

The `find_record` method returns the records that match a given title, so they can be stored in a variable and displayed:

In [10]:
search_results = history_log.find_record(title='Step 1')
print('Number of found items:', len(search_results))

for result in search_results:
    print(result.to_str())

Number of found items: 1
Step 1: Loaded data


This method returns a list, which is useful when several entries share the same `title`:

In [11]:
history_log_repeated_titles = DataContainerHistory()
history_log_repeated_titles.initialise_record(list_of_entries=initial_entries)

# Add a record with the same title as an existing one
history_log_repeated_titles.log_record(title='Step 1', comments='Rinse and repeat')

search_results = history_log_repeated_titles.find_record(title='Step 1')
print('Number of found items:', len(search_results))

for result in search_results:
    print(result.to_str())

Number of found items: 2
Step 1: Loaded data
Step 1: Rinse and repeat
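
To make the behaviour of `is_record` and `find_record` concrete, here is a plain-Python sketch of title-based lookup over `(title, comment)` tuples. It is an illustration only, not PyKOALA's implementation: the helper names `has_title` and `find_by_title` are hypothetical, and exact title matching is an assumption that the outputs above are consistent with but do not prove.

```python
# Stand-in for the history log: (title, comment) pairs, as in the cells above.
entries = [
    ("Step 1", "Loaded data"),
    ("Step 2", "Filtered outliers"),
    ("Step 3", "Normalized data"),
    ("Step 1", "Rinse and repeat"),
]


def has_title(entries, title):
    """Hypothetical helper: True if any entry carries this exact title."""
    return any(entry_title == title for entry_title, _ in entries)


def find_by_title(entries, title):
    """Hypothetical helper: every entry with this exact title, in log order."""
    return [entry for entry in entries if entry[0] == title]


print(has_title(entries, "Step 42"))

matches = find_by_title(entries, "Step 1")
print(len(matches))
for title, comment in matches:
    print(f"{title}: {comment}")
```

Returning a list even for a single match keeps the calling code the same whether a title is unique or repeated.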


#### `dump_to_header`

This method writes the log into an `astropy.io.fits.Header`:

In [12]:
from astropy.io import fits
history_log.dump_to_header()


PYKOALA0= 'Loaded data'        / Step 1                                         
PYKOALA1= 'Filtered outliers'  / Step 2                                         
PYKOALA2= 'Normalized data'    / Step 3                                         
PYKOALA3= 'Additional data added with log_record method' / Step 4               
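
In the header printed above, each record gets a keyword of the form `PYKOALA<n>`, with the comment text stored as the value and the record title as the card comment. The following plain-Python sketch reproduces that mapping as inferred from the printed output alone. The helper `to_cards` is hypothetical, it does not use `astropy`, and it makes no attempt to reproduce FITS column padding.

```python
# Stand-in for the history log: (title, comment) pairs, as in the cells above.
entries = [
    ("Step 1", "Loaded data"),
    ("Step 2", "Filtered outliers"),
    ("Step 3", "Normalized data"),
    ("Step 4", "Additional data added with log_record method"),
]


def to_cards(entries, prefix="PYKOALA"):
    """Hypothetical helper: build (keyword, value, comment) triples.

    The keyword is the prefix plus the entry index, the value is the
    comment text, and the card comment is the record title.
    """
    return [
        (f"{prefix}{index}", comment, title)
        for index, (title, comment) in enumerate(entries)
    ]


cards = to_cards(entries)
for keyword, value, comment in cards:
    print(f"{keyword}= '{value}' / {comment}")
```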

#### `dump_to_text`

This method writes the log to a text file:

In [13]:
from pathlib import Path

# Show the current working directory (portable alternative to `pwd`)
print(Path.cwd())
path_to_file = './output/history_log.txt'
history_log.dump_to_text(file=path_to_file)

/home/mbolivar/pykoala-tutorials/tutorials/0-introduction
[pykoala] 2024/11/22 13:54|INFO> Writting log into text file


You can confirm that the file was created in the `output` directory.
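
One way to confirm this from Python is with `pathlib`. The sketch below is self-contained, so it writes a stand-in file to a temporary directory instead of calling `dump_to_text`. The file contents and the helper `read_log` are illustrative assumptions, and the layout of the real file may differ.

```python
import tempfile
from pathlib import Path


def read_log(path):
    """Hypothetical helper: return the file's lines, or None if it is missing."""
    path = Path(path)
    if not path.is_file():
        return None
    return path.read_text().splitlines()


with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = Path(tmp_dir) / "output" / "history_log.txt"
    # Create the parent directory first; writing fails if it does not exist.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Stand-in contents, not produced by PyKOALA.
    log_path.write_text("Step 1: Loaded data\nStep 2: Filtered outliers\n")

    found = read_log(log_path)
    missing = read_log(Path(tmp_dir) / "output" / "no_such_file.txt")

print(found)
print(missing)
```

With the tutorial's file, the equivalent check would be `read_log('./output/history_log.txt')`.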