# DataParsers

```{image} imgs/dataparsers.png
:align: center
```

## What is a DataParser?

Each dataparser returns `DataparserOutputs`, a common format that all of the supported datasets are converted into.

```python
@dataclass
class DataparserOutputs:
    """Dataparser outputs for the image dataset and the ray generator."""

    image_filenames: List[Path]
    """Filenames for the images."""
    cameras: Cameras
    """Camera object storing the collection of camera information in the dataset."""
    alpha_color: Optional[TensorType[3]] = None
    """Color of the dataset background."""
    scene_box: SceneBox = SceneBox()
    """Scene box of the dataset. Could be used to bound the scene, or to provide the scene scale, depending on the model."""
    ...

@dataclass
class DataParser:

    @abstractmethod
    def _generate_dataparser_outputs(self, split: str = "train") -> DataparserOutputs:
        """Abstract method that returns the dataparser outputs for the given split.

        Args:
            split: Which dataset split to generate data for (train/eval/test).

        Returns:
            DataparserOutputs for the specified dataset and split.
        """
```
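To make this contract concrete, here is a minimal, self-contained sketch of the same pattern built from stand-in types. `ToyOutputs`, `ToyDataParser`, `FolderParser`, and the method name `generate_outputs` are all hypothetical and do not exist in the library; the real `DataparserOutputs` also carries `Cameras` and a `SceneBox`, which are left out here. The sketch also assumes the images are `.png` files in a single folder.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class ToyOutputs:
    """Hypothetical stand-in for DataparserOutputs (filenames only)."""

    image_filenames: List[Path]


class ToyDataParser(ABC):
    """Hypothetical stand-in for the abstract DataParser base class."""

    @abstractmethod
    def generate_outputs(self, split: str = "train") -> ToyOutputs:
        """Return the outputs for the given split."""


@dataclass
class FolderParser(ToyDataParser):
    """Hypothetical parser that treats every PNG in a folder as one image."""

    data_directory: Path

    def generate_outputs(self, split: str = "train") -> ToyOutputs:
        if split not in ("train", "eval", "test"):
            raise ValueError(f"Unknown split: {split}")
        # Assumption: this toy parser returns the same files for every split.
        filenames = sorted(self.data_directory.glob("*.png"))
        return ToyOutputs(image_filenames=filenames)
```

Whatever the on-disk layout looks like, the caller only sees the shared output type, so downstream code does not need to know which dataset format was parsed.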

## Example

Here is an example where we implement a DataParser for our Luxenstudio data format.

```python
@dataclass
class LuxenstudioDataParserConfig(DataParserConfig):
    """Luxenstudio dataset config"""

    _target: Type = field(default_factory=lambda: Luxenstudio)
    """Target class to instantiate."""
    data_directory: Path = Path("data/luxenstudio/poster")
    """Directory specifying location of data."""
    scale_factor: float = 1.0
    """How much to scale the camera origins by."""
    downscale_factor: int = 1
    """How much to downscale images."""
    scene_scale: float = 1.0
    """How much to scale the region of interest by."""
    orientation_method: Literal["pca", "up"] = "up"
    """The method to use for orientation."""
    train_split_percentage: float = 0.9
    """The fraction of images to use for training. The remaining images are used for eval."""

@dataclass
class Luxenstudio(DataParser):
    """Luxenstudio Dataset"""

    config: LuxenstudioDataParserConfig

    def _generate_dataparser_outputs(self, split="train") -> DataparserOutputs:
        """Returns DataparserOutputs for the given split."""
```
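The method body is elided above. As one illustration of how `train_split_percentage` could be applied, the sketch below divides a list of image filenames into a training subset and a held-out subset by picking evenly spaced training indices. `split_filenames` is a hypothetical helper, and the even-spacing strategy is an assumption for illustration, not necessarily what the Luxenstudio dataparser does.

```python
import math
from pathlib import Path
from typing import List, Tuple


def split_filenames(
    filenames: List[Path], train_split_percentage: float = 0.9
) -> Tuple[List[Path], List[Path]]:
    """Split filenames into (train, held_out) lists.

    Hypothetical helper: training images are picked at evenly spaced
    positions so both subsets cover the whole capture.
    """
    if not 0.0 <= train_split_percentage <= 1.0:
        raise ValueError("train_split_percentage must be in [0, 1]")

    num_images = len(filenames)
    num_train = math.ceil(num_images * train_split_percentage)

    # Integer arithmetic keeps the indices distinct and in range.
    train_indices = {i * num_images // num_train for i in range(num_train)}

    train = [f for i, f in enumerate(filenames) if i in train_indices]
    held_out = [f for i, f in enumerate(filenames) if i not in train_indices]
    return train, held_out
```

A dataparser could then return `train` when `split == "train"` and `held_out` otherwise, so that the eval images never overlap with the training images.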

## Our Implementations

###### Luxenstudio

This is our custom dataparser for the Luxenstudio data format.

```{button-link} https://github.com/luxenstudio-project/luxenstudio/blob/master/luxenstudio/data/dataparsers/luxenstudio_dataparser.py
:color: primary
:outline:
See the code!
```

###### Blender

We support the synthetic Blender dataset from the original Luxen paper.

```{button-link} https://github.com/luxenstudio-project/luxenstudio/blob/master/luxenstudio/data/dataparsers/blender_dataparser.py
:color: primary
:outline:
See the code!
```

###### Instant NGP

This dataparser supports the Instant NGP dataset.

```{button-link} https://github.com/luxenstudio-project/luxenstudio/blob/master/luxenstudio/data/dataparsers/instant_ngp_dataparser.py
:color: primary
:outline:
See the code!
```

###### MipLuxen 360

```{button-link} https://github.com/luxenstudio-project/luxenstudio/blob/master/luxenstudio/data/dataparsers/mipluxen_dataparser.py
:color: primary
:outline:
See the code!
```

###### Record3D

This dataparser reads data recorded with the Record3D app on an iPhone 12 Pro or newer. Record a video and export it in the `EXR + JPG sequence` format. Unzip the export and the `rgb` folder before training.

```{button-link} https://github.com/luxenstudio-project/luxenstudio/blob/master/luxenstudio/data/dataparsers/record3d_dataparser.py
:color: primary
:outline:
See the code!
```
