Support data extraction for analysis and machine learning.
To make extracting data from several systems easier, I've combined some of the tools I rely on. They can:
- Normalize and scale a data series
- One hot or label encode a data series
- Calculate frequencies and frequency percentages
- Summarize variables with basic statistics (mean, sum, min, max, median, standard deviation, etc.)
- Estimate the correlation between variables
- Extract probabilistic values for Markov models and other probabilistic models
- Save data into CSV files
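
To make a few of the items above concrete, here is a minimal sketch in plain Elixir of what scaling, frequency counting, and one-hot encoding compute. The module `PrepSketch` and its function names are hypothetical, written only for illustration; they are not the `analysis_prep` API, and the package's own functions may be named and shaped differently.

```elixir
defmodule PrepSketch do
  # Hypothetical illustration module; NOT part of analysis_prep.

  # Min-max scale a list of numbers into the range 0.0..1.0.
  def scale(series) do
    {min, max} = Enum.min_max(series)
    range = max - min

    Enum.map(series, fn x ->
      # A constant series has no spread, so map every value to 0.0.
      if range == 0, do: 0.0, else: (x - min) / range
    end)
  end

  # Count how often each distinct value occurs.
  def frequencies(series) do
    Enum.reduce(series, %{}, fn x, acc ->
      Map.update(acc, x, 1, &(&1 + 1))
    end)
  end

  # Express each count as a percentage of the series length.
  def percentages(series) do
    total = length(series)

    series
    |> frequencies()
    |> Map.new(fn {value, count} -> {value, count / total * 100} end)
  end

  # One-hot encode: one row per item, one column per distinct label
  # (columns follow the sorted order of the labels).
  def one_hot(series) do
    labels = series |> Enum.uniq() |> Enum.sort()

    Enum.map(series, fn x ->
      Enum.map(labels, fn label -> if label == x, do: 1, else: 0 end)
    end)
  end
end

IO.inspect(PrepSketch.scale([1, 2, 3]))
IO.inspect(PrepSketch.percentages(["a", "b", "a", "a"]))
IO.inspect(PrepSketch.one_hot(["red", "blue", "red"]))
```

Sorting the labels in `one_hot/1` is a design choice of this sketch: it keeps the column order stable no matter the order in which values first appear in the series.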
For now, the doctests are the best documentation for these features. I'll write up better demonstrations soon.
This is a work in progress. There are several things I'd still like to add:
- Demonstrate the core features
- Extraction via Ecto
- Binning functions
- Test and expand to support GenStage and Flow
- Time series analysis
If available in Hex, the package can be installed as:
- Add `analysis_prep` to your list of dependencies in `mix.exs`:
```elixir
def deps do
  [{:analysis_prep, "~> 0.1.0"}]
end
```
- Ensure `analysis_prep` is started before your application:
```elixir
def application do
  [applications: [:analysis_prep]]
end
```