# Example: Working with Delimited Numerical Data Files
A common task for data scientists, artificial intelligence researchers, engineers, and scientists is to load textual and numerical datasets, analyze them, and save the results back to disk. One of the most common data storage formats is the [Comma-Separated Values (CSV) file](https://en.wikipedia.org/wiki/Comma-separated_values). Let's figure out how to read and manipulate the data in these types of files.

### Learning objectives
In this example, we will develop our first `parser`, i.e., a piece of code that reads and interprets [Comma-Separated Values (CSV) files](https://en.wikipedia.org/wiki/Comma-separated_values), a popular format for storing tabular data in plain text. The `parser` processes each line of [the CSV file](https://en.wikipedia.org/wiki/Comma-separated_values), separates the values at commas (or another specified delimiter), and converts them into a structured format, such as arrays or [structs](https://docs.julialang.org/en/v1/base/base/#struct), which makes the data easier to manipulate and analyze.
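
To make the idea of a parser concrete, here is a minimal sketch of the per-line step: split one line at the delimiter and convert each field to a number. The function name `parse_numeric_line` and the sample line are hypothetical (they are not from `Files.jl`), and the sketch assumes every field is numeric.

```julia
# Hypothetical helper (not from Files.jl): split one line of delimited text
# into fields and convert each field to a Float64.
# Assumes every field on the line is numeric.
function parse_numeric_line(line::AbstractString; delim::Char = ',')::Vector{Float64}
    fields = split(strip(line), delim)                  # "1,2.5" -> ["1", "2.5"]
    return [parse(Float64, strip(f)) for f in fields]   # text -> numbers
end

# Made-up sample line, for illustration only
row = parse_numeric_line("10, 0.25, 0.01")
println(row)
```

A full parser repeats this step for every line of the file and collects the rows into an array or a struct.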

We'll start this discussion in lecture and continue to work on these ideas in `Lab-3d`. So let's get started!
* __Task 1__: Load the `Bubblesort.csv` file. We profiled our [bubble sort implementation](https://en.wikipedia.org/wiki/Bubble_sort) and saved the data in the `Bubblesort.csv` file. Let's develop code to read this data from the file system.

## Setup
In this example, we'll use functions defined in `Files.jl` to read a numerical data file. The `Include.jl` file loads these functions and sets some required paths for this example.

In [1]:
include("Include.jl");

[32m[1m  Activating[22m[39m project at `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-3`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-3/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-3/Manifest.toml`
[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-3/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-3/Manifest.toml`


## Task 1: Load the `Bubblesort.csv` file
To get a better idea of the performance of the [Bubble Sort algorithm](https://en.wikipedia.org/wiki/Bubble_sort), we used the [BenchmarkTools.jl package](https://github.com/JuliaCI/BenchmarkTools.jl) to measure the `mean` and `standard deviation` of the time required to sort a random array of integers as a function of the length of the vector. We stored this data in the [Bubblesort.csv](data/Bubblesort.csv) file.
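
As a rough sketch of how timing data like this could be generated, the code below times a textbook bubble sort on random integer vectors of increasing length and reports the mean and standard deviation of the run time. This is an illustration under stated assumptions, not the code used to produce `Bubblesort.csv`: it swaps in Julia's built-in `@elapsed` macro for BenchmarkTools.jl (so the timings are less careful), and the names `bubblesort_sketch` and `time_sort`, the vector lengths, and the number of trials are all hypothetical.

```julia
using Statistics  # standard library: mean, std

# A textbook bubble sort (sketch; not necessarily the implementation profiled above).
function bubblesort_sketch(v::Vector{Int})::Vector{Int}
    a = copy(v)                      # leave the input untouched
    n = length(a)
    for i in 1:(n - 1)
        swapped = false
        for j in 1:(n - i)
            if a[j] > a[j + 1]
                a[j], a[j + 1] = a[j + 1], a[j]   # swap neighbors that are out of order
                swapped = true
            end
        end
        swapped || break             # no swaps in this pass: already sorted
    end
    return a
end

# Time `trials` sorts of fresh random vectors of length n.
# Returns (mean, standard deviation) of the elapsed time in seconds.
function time_sort(n::Int; trials::Int = 20)
    bubblesort_sketch(rand(1:1000, 8))   # warm-up call so compilation is not timed
    times = Float64[]
    for _ in 1:trials
        v = rand(1:1000, n)
        t = @elapsed bubblesort_sketch(v)
        push!(times, t)
    end
    return (mean(times), std(times))
end

# Hypothetical vector lengths, chosen only for illustration
for n in (10, 100, 1000)
    (μ, σ) = time_sort(n)
    println("n = $n: mean = $μ s, std = $σ s")
end
```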

* Let's load this data file using the `simplereadcsvfile` function defined in `Files.jl`. First, let's set the path to the data file in the `path_to_data_file` variable:

In [2]:
path_to_data_file = joinpath(_PATH_TO_DATA, "Bubblesort.csv")

"/Users/jeffreyvarner/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-3/data/Bubblesort.csv"

Next, we pass the `path_to_data_file` variable to the [`simplereadcsvfile` function](src/Files.jl). The `simplereadcsvfile` function returns two items: the `header`, i.e., the column names, and the data stored in the file:

In [3]:
(bubble_header, bubble_data) = simplereadcsvfile(path_to_data_file);
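
The implementation of `simplereadcsvfile` lives in `Files.jl` and is not shown here. To illustrate the general shape of a reader that returns a header and a data array, here is a minimal sketch under stated assumptions: the first non-blank line holds the column names, every remaining line holds only numbers, and every row has the same number of fields as the header. The name `sketch_readcsvfile`, the return types, and the demo file contents are hypothetical; the real function may behave differently.

```julia
# Hypothetical stand-in for simplereadcsvfile; the real function in Files.jl may differ.
# Assumes: first non-blank line = column names, all other fields numeric,
# and every row has as many fields as the header.
function sketch_readcsvfile(path::AbstractString; delim::Char = ',')
    lines = filter(!isempty, strip.(readlines(path)))    # drop blank lines
    header = String.(strip.(split(lines[1], delim)))     # column names
    data = zeros(Float64, length(lines) - 1, length(header))
    for (i, line) in enumerate(lines[2:end])
        for (j, field) in enumerate(split(line, delim))
            data[i, j] = parse(Float64, strip(field))    # text -> number
        end
    end
    return (header, data)
end

# Demo on a temporary file with made-up column names and values
demo_path = tempname()
write(demo_path, "n,mean,std\n10,1.0,0.1\n100,9.5,0.7\n")
(demo_header, demo_data) = sketch_readcsvfile(demo_path)
println(demo_header)
println(size(demo_data))
```

A sketch this simple breaks on quoted fields, missing values, or non-numeric columns, which is a useful starting point for the list of issues below.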

### Let's make a list of issues with the `simplereadcsvfile` function
1. Hmmm