TimeSeries

Clifford Bohm edited this page Jan 24, 2021 · 13 revisions

Time series are lists of lists of numbers. Each outer list represents a set of data recorded over time (samples) and each inner list represents individual pieces of data recorded at each time. Here we are primarily interested in time series that relate to brain states (inputs, outputs, and hidden/internal states) and world states.

The time series tools exist in the TS:: namespace. To use time series functions and objects, prefix them with TS::, for example:

TS::Join(X,Y);

data structure

In MABE, time series are just vector<vector<double>>. We also define an integer time series (intTimeSeries), a vector<vector<int>> used to store discrete value time series. The TS::TimeSeries and TS::intTimeSeries types are defined with typedefs:

typedef std::vector<std::vector<double>> TimeSeries;
typedef std::vector<std::vector<int>> intTimeSeries;

continuous value vs. discrete value time series

Generally speaking, we collect data into either a TimeSeries (continuous value) or an intTimeSeries (discrete value). However, all of the information theory functions (and the other functions described below) operate only on intTimeSeries. The continuous value type is provided so that data can be collected without first deciding how to discretize it. To convert a continuous value time series to a discrete value time series, use a discretization function such as TS::remapToIntTimeSeries(). Discretization functions take a TimeSeries and convert it to an intTimeSeries using some rule. That rule may be simple (e.g. for every value in the time series, [if <= 0, set to 0] and [if > 0, set to 1]) or more complex (e.g. for each entry position, find the median of that position across all samples, then threshold the values in that position in every sample to 0 or 1 based on this median).
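
As a rough illustration of the simple rule described above, the following sketch thresholds every value at zero. The function name thresholdAtZero is hypothetical and is not part of MABE; the typedefs are local copies so the example compiles on its own.

```cpp
#include <cassert>
#include <vector>

// Local copies of the typedefs, so the sketch compiles without MABE.
typedef std::vector<std::vector<double>> TimeSeries;
typedef std::vector<std::vector<int>> intTimeSeries;

// Hypothetical helper (not part of MABE): apply the simple rule
// [if <= 0, set to 0] and [if > 0, set to 1] to every element.
intTimeSeries thresholdAtZero(const TimeSeries& X) {
    intTimeSeries out;
    out.reserve(X.size());
    for (const std::vector<double>& sample : X) {
        std::vector<int> row;
        row.reserve(sample.size());
        for (double value : sample) {
            row.push_back(value > 0.0 ? 1 : 0);
        }
        out.push_back(row);
    }
    assert(out.size() == X.size());  // one output sample per input sample
    return out;
}
```

With this sketch, a sample such as {-0.5, 0.0, 2.3} would become {0, 0, 1}.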

If you are interested in converting time series, you can skip to the bottom of this page.

creating and adding to TimeSeries or intTimeSeries

Good news: these are just vectors of vectors, so all the normal vector operations are valid!
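
For instance, a small intTimeSeries could be built with ordinary vector calls, roughly as follows. The function buildExample is a hypothetical name used only for illustration, and the typedefs are local copies so the sketch does not depend on MABE.

```cpp
#include <cassert>
#include <vector>

// Local copies of the typedefs, so the sketch compiles without MABE.
typedef std::vector<std::vector<double>> TimeSeries;
typedef std::vector<std::vector<int>> intTimeSeries;

// Hypothetical example: record three samples of two values each.
intTimeSeries buildExample() {
    intTimeSeries X;
    X.push_back({1, 0});       // sample recorded at time 0
    X.push_back({0, 1});       // sample recorded at time 1
    X.push_back({1, 1});       // sample recorded at time 2
    X[0][1] = 1;               // overwrite one element of the first sample
    assert(X.size() == 3);     // three samples
    assert(X[2].size() == 2);  // two elements per sample
    return X;
}
```

The same calls work on a TimeSeries, with double values in place of int values.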

Enum classes

Two enum classes are provided; they are used by the functions described below.

enum class Position { FIRST, LAST }; // used with trimTimeSeries and extendTimeSeries
enum class RemapRules { INT, BIT, TRIT, NEAREST_INT, NEAREST_BIT, NEAREST_TRIT, MEDIAN }; // used with remapToIntTimeSeries

to string and visualization tools

// convert one sample from an intTimeSeries to a string, sep will be placed between elements
std::string TimeSeriesSampleToString(const std::vector<int>& sample, const std::string& sep = " ");
	
// convert an intTimeSeries to a vector<string>, sep will be placed between elements
std::vector<std::string> TimeSeriesToVectString(const intTimeSeries& X, const std::string& sep = " ");

// convert an intTimeSeries to a string, sepSample will be placed between samples, and sepElement between elements in each sample
std::string TimeSeriesToString(const intTimeSeries& X, const std::string& sepElement = " ", const std::string& sepSample = "\n");

intTimeSeries manipulation tools

// given an intTimeSeries and a list of indices, return a new intTimeSeries where each state
// only has the elements indicated by indices
// example indices: {0,1,3}
// old intTimeSeries    new intTimeSeries
// 4,5,3,2,1    ->    4,5,2
// 3,1,2,1,1    ->    3,1,1
// 9,9,1,2,3    ->    9,9,2
intTimeSeries subSetTimeSeries(const intTimeSeries& X, const std::vector<int>& indices);
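
The column selection described in the comment above can be sketched as follows. This is a stand-in written for illustration under the assumption that every index is valid for every sample; subSetSketch is a hypothetical name and is not MABE's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int>> intTimeSeries;

// Hypothetical stand-in for subSetTimeSeries: keep only the elements
// of each sample whose positions are listed in indices, in that order.
intTimeSeries subSetSketch(const intTimeSeries& X, const std::vector<int>& indices) {
    intTimeSeries out;
    out.reserve(X.size());
    for (const std::vector<int>& sample : X) {
        std::vector<int> row;
        row.reserve(indices.size());
        for (int i : indices) {
            // ASSUMPTION: every index must be valid for every sample.
            assert(i >= 0 && static_cast<std::size_t>(i) < sample.size());
            row.push_back(sample[static_cast<std::size_t>(i)]);
        }
        out.push_back(row);
    }
    return out;
}
```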

// given an intTimeSeries (i.e. vector<vector<int>>) X, return a vector<intTimeSeries>
// where each element of the states of X is now its own intTimeSeries
// (used to isolate features, for example, when calculating fragmentation)
std::vector<intTimeSeries> deconstructTimeSeries(const intTimeSeries& X);
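
One reading of this description is that each "column" of X becomes a separate intTimeSeries whose samples hold a single element. The sketch below follows that reading; deconstructSketch is a hypothetical name, and the assumption that all samples have the same width is mine rather than something stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int>> intTimeSeries;

// Hypothetical stand-in for deconstructTimeSeries: element c of every
// sample in X is collected into its own single-element-per-sample series.
std::vector<intTimeSeries> deconstructSketch(const intTimeSeries& X) {
    std::vector<intTimeSeries> out;
    if (X.empty()) {
        return out;
    }
    const std::size_t width = X[0].size();
    out.resize(width);
    for (const std::vector<int>& sample : X) {
        assert(sample.size() == width);  // ASSUMPTION: all samples have the same width
        for (std::size_t c = 0; c < width; ++c) {
            out[c].push_back(std::vector<int>{sample[c]});
        }
    }
    return out;
}
```

Under this reading, the series {4,5},{3,1} would yield two series: {4},{3} and {5},{1}.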

join

Join combines intTimeSeries; this results in a new intTimeSeries with the same number of samples, where each sample is a concatenation of the corresponding samples of the original time series. You can provide X and Y time series, or a vector of time series if you need to join more than 2.

// given 2 intTimeSeries (X and Y) return a new intTimeSeries where each state is {X0,X1,...,Xn,Y0,Y1,...,Ym}
intTimeSeries Join(const intTimeSeries& X, const intTimeSeries& Y);

// given a vector of intTimeSeries return a new intTimeSeries where each state is the concatenation of the corresponding states, in order
intTimeSeries Join(const std::vector<intTimeSeries>& data);
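
The per-sample concatenation can be sketched for the two-series case as follows. joinSketch is a hypothetical name, and the check that both series have the same number of samples is an assumption of this sketch; how MABE itself handles a mismatch is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int>> intTimeSeries;

// Hypothetical stand-in for the two-argument Join: sample t of the result
// is sample t of X followed by sample t of Y.
intTimeSeries joinSketch(const intTimeSeries& X, const intTimeSeries& Y) {
    assert(X.size() == Y.size());  // ASSUMPTION: same number of samples
    intTimeSeries out;
    out.reserve(X.size());
    for (std::size_t t = 0; t < X.size(); ++t) {
        std::vector<int> row = X[t];
        row.insert(row.end(), Y[t].begin(), Y[t].end());
        out.push_back(row);
    }
    return out;
}
```

Joining more than two series could then be done by folding this function over a vector of series.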

trim

The trim functions reduce the size of an intTimeSeries based on lifetime information. If a time series should be considered as a whole, then it is one lifetime. Four versions of the trim function are provided. These can be divided into range vs. FIRST/LAST versions: range indicates a range within each lifetime to keep, and FIRST/LAST indicates that exactly n samples should be trimmed off the beginning or end. For both range and FIRST/LAST versions, there are two ways to define lifetimes: either as a vector (lifeTimes) or simply as a number of lifetimes (lives), in which case the function will assume all lifetimes are the same length and generate a lifeTimes list if possible, or generate an error if not.

// given an intTimeSeries (experience), a range (START,END) with values in [0.0, 1.0], and either a number of lives or a list of lifeTimes
// return a new intTimeSeries where, for each life, states before START and after END are removed
// If lives is used, (experience.size() / lives) must be an int (assumes all lives are the same length).
// If lifeTimes is used, then sum(lifeTimes) must equal experience.size()
intTimeSeries trimTimeSeries(const intTimeSeries& experience, const std::pair<double, double>& range, const std::vector<int>& lifeTimes);
intTimeSeries trimTimeSeries(const intTimeSeries& experience, const std::pair<double, double>& range, size_t lives);

// remove n samples from the start (FIRST) or end (LAST) of each lifetime of an intTimeSeries and return a new time series
// If lives is used, (experience.size() / lives) must be an int (assumes all lives are the same length).
// If lifeTimes is used, then sum(lifeTimes) must equal experience.size()
//   - OR -
//     sum(lifeTimes) + lifeTimes.size() == experience.size(), in which case each lifeTime will be assumed to be 1 larger
//     this is to address the issue caused by oversized TS - note, n = 1 in this case will result in a new TS that matches lifeTimes
//     i.e. hidden is generally recorded at T+1, but on the first update must also be recorded at T - resulting in 1 extra sample per lifetime
intTimeSeries trimTimeSeries(const intTimeSeries& experience, const Position& removeWhich, const std::vector<int>& lifeTimes, int n = 1);
intTimeSeries trimTimeSeries(const intTimeSeries& experience, const Position& removeWhich, size_t lives, int n = 1);
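
To make the range version more concrete, here is a minimal sketch of per-lifetime range trimming. The names lifeTimesFromLives and trimRangeSketch are hypothetical, and the way fractional positions are rounded (floor of START * length and END * length) is an assumption of this sketch; MABE may round differently.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

typedef std::vector<std::vector<int>> intTimeSeries;

// Hypothetical helper: turn a number of equal-length lives into a lifeTimes list.
std::vector<int> lifeTimesFromLives(std::size_t samples, std::size_t lives) {
    assert(lives > 0 && samples % lives == 0);  // all lives must be the same length
    return std::vector<int>(lives, static_cast<int>(samples / lives));
}

// Hypothetical stand-in for the range version of trimTimeSeries.
// ASSUMPTION: for a life of length len, samples with index in
// [floor(START * len), floor(END * len)) are kept; MABE may round differently.
intTimeSeries trimRangeSketch(const intTimeSeries& experience,
                              const std::pair<double, double>& range,
                              const std::vector<int>& lifeTimes) {
    assert(0.0 <= range.first && range.first <= range.second && range.second <= 1.0);
    const int total = std::accumulate(lifeTimes.begin(), lifeTimes.end(), 0);
    assert(total >= 0 && static_cast<std::size_t>(total) == experience.size());

    intTimeSeries out;
    std::size_t lifeStart = 0;  // index of the first sample of the current life
    for (int len : lifeTimes) {
        const std::size_t first = static_cast<std::size_t>(range.first * len);
        const std::size_t last = static_cast<std::size_t>(range.second * len);
        for (std::size_t i = first; i < last; ++i) {
            out.push_back(experience[lifeStart + i]);
        }
        lifeStart += static_cast<std::size_t>(len);
    }
    return out;
}
```

Under these assumptions, a range of (0.5, 1.0) keeps the second half of every lifetime, which is one way to discard an early "warm-up" period.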

trim and bloated time series

Sometimes a time series will have 1 extra entry per lifetime; this is a bloated time series. This generally results from the fact that hidden states (i.e. recurrent states) must exist both before and after every brain update, while lifeTimes is a list of counts of calls to brain update. The trim functions are aware of bloated time series: if sum(lifeTimes) + lifeTimes.size() == timeSeries.size(), they will assume the time series is bloated and act accordingly.
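
The bloated check and the FIRST/LAST trim can be sketched together as follows. trimEndsSketch is a hypothetical name; the enum is a local copy of Position, and the requirement that n not exceed any lifetime length is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

typedef std::vector<std::vector<int>> intTimeSeries;
enum class Position { FIRST, LAST };  // local copy of the enum, for a standalone sketch

// Hypothetical stand-in for the FIRST/LAST version of trimTimeSeries.
intTimeSeries trimEndsSketch(const intTimeSeries& experience, Position removeWhich,
                             std::vector<int> lifeTimes, int n = 1) {
    const std::size_t total = static_cast<std::size_t>(
        std::accumulate(lifeTimes.begin(), lifeTimes.end(), 0));
    if (total + lifeTimes.size() == experience.size()) {
        // Bloated: one extra sample per lifetime, so treat each life as 1 longer.
        for (int& len : lifeTimes) {
            ++len;
        }
    } else {
        assert(total == experience.size());
    }

    intTimeSeries out;
    std::size_t lifeStart = 0;  // index of the first sample of the current life
    for (int len : lifeTimes) {
        assert(n >= 0 && n <= len);  // ASSUMPTION: cannot remove more than a whole life
        const std::size_t skipFront = (removeWhich == Position::FIRST) ? static_cast<std::size_t>(n) : 0;
        const std::size_t skipBack = (removeWhich == Position::LAST) ? static_cast<std::size_t>(n) : 0;
        const std::size_t lifeEnd = lifeStart + static_cast<std::size_t>(len);
        for (std::size_t i = lifeStart + skipFront; i + skipBack < lifeEnd; ++i) {
            out.push_back(experience[i]);
        }
        lifeStart = lifeEnd;
    }
    return out;
}
```

For a bloated series of 6 samples with lifeTimes {2, 2}, trimming FIRST with n = 1 leaves 4 samples, which matches sum(lifeTimes).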

extend

It is sometimes useful to add samples to a time series. The two versions of this function (one takes lifeTimes, the other lives) will add n copies of a provided sample at the beginning (FIRST) or end (LAST) of every lifetime in a given intTimeSeries.

// given intTimeSeries X and state Y, concat Y at the beginning or end of every life in X, n times
intTimeSeries extendTimeSeries(const intTimeSeries& X, const std::vector<int>& lifeTimes, const std::vector<int> Y, Position addWhere, int n = 1);
intTimeSeries extendTimeSeries(const intTimeSeries& X, const size_t lives, const std::vector<int> Y, Position addWhere, int n = 1);
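
A minimal sketch of the lifeTimes version might look like this. extendSketch is a hypothetical name, the enum is a local copy of Position, and the sketch assumes sum(lifeTimes) equals X.size() (it does not attempt to handle bloated time series).

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

typedef std::vector<std::vector<int>> intTimeSeries;
enum class Position { FIRST, LAST };  // local copy of the enum, for a standalone sketch

// Hypothetical stand-in for extendTimeSeries: add n copies of Y at the
// beginning (FIRST) or end (LAST) of every life in X.
intTimeSeries extendSketch(const intTimeSeries& X, const std::vector<int>& lifeTimes,
                           const std::vector<int>& Y, Position addWhere, int n = 1) {
    assert(n >= 0);
    const int total = std::accumulate(lifeTimes.begin(), lifeTimes.end(), 0);
    // ASSUMPTION: lifeTimes must account for every sample in X.
    assert(total >= 0 && static_cast<std::size_t>(total) == X.size());

    intTimeSeries out;
    std::size_t lifeStart = 0;  // index of the first sample of the current life
    for (int len : lifeTimes) {
        if (addWhere == Position::FIRST) {
            out.insert(out.end(), static_cast<std::size_t>(n), Y);
        }
        for (std::size_t i = 0; i < static_cast<std::size_t>(len); ++i) {
            out.push_back(X[lifeStart + i]);
        }
        if (addWhere == Position::LAST) {
            out.insert(out.end(), static_cast<std::size_t>(n), Y);
        }
        lifeStart += static_cast<std::size_t>(len);
    }
    return out;
}
```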

updateLifeTimes

This function takes a lifeTimes list and adds n to each element. This is useful if you have altered the size of a time series and still need lifeTimes to be correct (for example, if you are going to do more manipulations).

// given a lifeTimes list (i.e. list of lifeTimes) add n to each lifetime (n may be negative) and return a new lifeTimes list
std::vector<int> updateLifeTimes(const std::vector<int>& lifeTimes, int n);
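
The behavior described in the comment above amounts to an element-wise addition, sketched below. updateLifeTimesSketch is a hypothetical name, and the check that no lifetime becomes negative is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for updateLifeTimes: add n (which may be negative)
// to every lifetime and return the new list.
std::vector<int> updateLifeTimesSketch(const std::vector<int>& lifeTimes, int n) {
    std::vector<int> out(lifeTimes);
    for (int& len : out) {
        len += n;
        assert(len >= 0);  // ASSUMPTION: a lifetime cannot become negative
    }
    return out;
}
```

For example, after removing one sample from each lifetime of a series, calling this with n = -1 on {5, 3} would give {4, 2}.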

remapToIntTimeSeries

Before we get to this function, it is important to understand that the discretization function (that's what this function is all about!) can significantly alter your results. You should think carefully about exactly which function you use. That being said, the function described here provides a set of discretization rules that can be useful and that are integrated into other higher-level functions. There is nothing stopping you from developing your own discretization functions if needed! If you do, you may need to rewrite some higher-level functionality, but this should not amount to more than some copy/paste with minor alterations.

The remapToIntTimeSeries function takes a time series, a remap rule (from the enum class RemapRules) and an optional vector remap parameter. This function is a semi-temporary placeholder - as it should probably make use of something like function pointers or lambdas - but it will do for now.

// given a TimeSeries X and a mapping rule, return a new intTimeSeries based on rule
intTimeSeries remapToIntTimeSeries(const TimeSeries& X, RemapRules rule, std::vector<double> ruleParameter = { -1 });

The available remap rules are:
INT - perform (int) on every element of every sample
BIT - perform BIT() (i.e. <= 0 maps to 0, and > 0 maps to 1) on every element of every sample
TRIT - perform TRIT() (i.e. < 0 maps to -1, 0 maps to 0, and > 0 maps to 1) on every element of every sample
NEAREST_INT - round every element of every sample to the nearest int
NEAREST_BIT - round every element of every sample to the nearest bit
NEAREST_TRIT - round every element of every sample to the nearest trit (-1,0,1)
MEDIAN(n) - for each "column", find the median. Values greater than the median are set to 1, values less than the median to 0. If n is provided (as ruleParameter = {n}), then the n-way median will be established and the resulting values will be integers in [0..n-1].
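
To illustrate the two-way MEDIAN rule, here is a sketch of a per-column median threshold. medianRemapSketch is a hypothetical name and not MABE's code. Two details are assumptions of this sketch because they are not specified above: values equal to the median map to 0, and for an even number of samples the median is the mean of the two middle values. The n-way variant is not covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double>> TimeSeries;
typedef std::vector<std::vector<int>> intTimeSeries;

// Hypothetical stand-in for the two-way MEDIAN rule.
intTimeSeries medianRemapSketch(const TimeSeries& X) {
    if (X.empty()) {
        return intTimeSeries();
    }
    const std::size_t width = X[0].size();
    intTimeSeries out(X.size(), std::vector<int>(width, 0));
    for (std::size_t c = 0; c < width; ++c) {
        // Collect and sort the values in this column across all samples.
        std::vector<double> column;
        column.reserve(X.size());
        for (const std::vector<double>& sample : X) {
            assert(sample.size() == width);  // ASSUMPTION: all samples have the same width
            column.push_back(sample[c]);
        }
        std::sort(column.begin(), column.end());
        const std::size_t mid = column.size() / 2;
        // ASSUMPTION: even counts use the mean of the two middle values.
        const double median = (column.size() % 2 == 1)
                                  ? column[mid]
                                  : 0.5 * (column[mid - 1] + column[mid]);
        for (std::size_t t = 0; t < X.size(); ++t) {
            // ASSUMPTION: values equal to the median map to 0.
            out[t][c] = (X[t][c] > median) ? 1 : 0;
        }
    }
    return out;
}
```

Because the threshold is computed per column, two columns with very different scales are each split around their own median rather than around a shared value.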
