In [None]:
import numpy as np
import datetime

from btgym import BTgymEnv, BTgymDataset
from btgym.datafeed import BTgymRandomDataDomain, BTgymSequentialDataDomain

from logbook import WARNING, INFO, DEBUG

- BTGym data is basically a discrete time flow of equity-type records. To define an episodic MDP over such data and to state a formal problem objective, the data has to be structured in some way.

- This notebook is a brief introduction to the API implementation of the formal definitions introduced in Section 1 (Data) of this draft: https://github.com/Kismuz/btgym/blob/master/docs/papers/btgym_formalism_draft.pdf
- The objects described here can be thought of as nested data containers with built-in properties such as sampling and splitting data into train and test subsets.

In [None]:
# Make data domain - top-level data structure:
domain = BTgymRandomDataDomain(
    filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
    target_period={'days': 29, 'hours': 0, 'minutes': 0},
    trial_params=dict(
        start_weekdays={0, 1, 2, 3, 4, 5, 6},
        sample_duration={'days': 10, 'hours': 0, 'minutes': 0},
        start_00=True,
        time_gap={'days': 5, 'hours': 0},
        test_period={'days': 2, 'hours': 0, 'minutes': 0},
    ),
    episode_params=dict(
        start_weekdays={0, 1, 2, 3, 4, 5, 6},
        sample_duration={'days': 0, 'hours': 23, 'minutes': 55},
        start_00=False,
        time_gap={'days': 0, 'hours': 10},
    ),
    log_level=INFO, # Set to DEBUG to see more output
)

Here the Domain instance is defined so that it:
- Holds one year of 1-minute bar data;
- Splits data into source and target domains, with the target domain getting the last 29 days of the year;
- Defines each Trial to consist of 8 days of train data followed by 2 days of test data;
- Limits each Episode to a maximum duration of 23:55.
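
The split arithmetic above can be checked with a small standalone sketch. It uses only the standard library and does not call BTgym; the dataset bounds are an assumption taken from the file name (calendar year 2016), and the real file may start or end on different timestamps.

```python
from datetime import datetime, timedelta

# Assumed dataset bounds (hypothetical: inferred from the file name only).
data_start = datetime(2016, 1, 1)
data_end = datetime(2017, 1, 1)

# Values copied from the domain definition above.
target_period = timedelta(days=29)
trial_duration = timedelta(days=10)
test_period = timedelta(days=2)

# The target domain takes the tail of the dataset; the source domain gets the rest.
target_start = data_end - target_period
source_days = (target_start - data_start).days

# Each trial is split into a train head and a test tail.
train_duration = trial_duration - test_period

print('source domain:', data_start.date(), '->', target_start.date())
print('target domain:', target_start.date(), '->', data_end.date())
print('source length in days:', source_days)
print('trial: {} train days + {} test days'.format(
    train_duration.days, test_period.days))
```
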

##### Manually perform sampling cycle:

- Prepare the domain; this is essential for stateful classes such as BTgymSequentialDataDomain:

In [None]:
domain.reset()

##### Sample trial from source domain:

In [None]:
trial = domain.sample(sample_type=0)
trial.reset()

- Trials can be sampled from the `Source` or `Target` domain by passing the kwarg `sample_type` to the `sample()` method: 0 is for `source` and 1 is for `target`. These correspond to the conventional train and test data.
- If the target period duration is set to 0:0:0, trying to get a target sample will raise an exception.
- Note that during real BTgym operation the domain instance is held by `btgym_data_server`.

- A different Trial sample is sent to every environment instance, so the corresponding `btgym_server` can sample multiple episodes from it until the agent requests another trial via the Gym API by calling `env.reset()` with kwarg `new_trial=True`.
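
The nested sampling idea can be illustrated with a toy model that is independent of BTgym. `ToyContainer`, its fields, and the `child_test_len` argument are hypothetical names invented for this sketch; it only mimics the convention that `sample_type=0` draws from the train (source) part and `sample_type=1` from the test (target) part, and that an empty test part leads to an exception.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyContainer:
    """Hypothetical stand-in for a domain or trial: an index range [start, end)
    whose last `test_len` units form the test (or target) part."""
    start: int
    end: int
    test_len: int = 0

    def sample(self, sample_type, length, child_test_len=0, rng=random):
        # sample_type 0 -> train/source part, 1 -> test/target part.
        if sample_type == 1 and self.test_len == 0:
            raise ValueError('test/target part is empty')
        split = self.end - self.test_len
        lo, hi = (self.start, split) if sample_type == 0 else (split, self.end)
        if hi - lo < length:
            raise ValueError('requested sample does not fit')
        # randint is inclusive on both ends, so the sample ends at or before `hi`.
        first = rng.randint(lo, hi - length)
        return ToyContainer(first, first + length, child_test_len)


rng = random.Random(0)
# Units are days: a 366-day toy domain with a 29-day target tail.
toy_domain = ToyContainer(start=0, end=366, test_len=29)
toy_trial = toy_domain.sample(0, length=10, child_test_len=2, rng=rng)
toy_episode = toy_trial.sample(1, length=1, rng=rng)
print(toy_trial, toy_episode)
```

The trial always lies inside the source part of the toy domain, and the episode always lies inside the last two days of that trial.
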


##### Sample episode from trial test interval:

In [None]:
episode = trial.sample(sample_type=1)

- Episodes can be sampled from the Trial `train` or `test` subsets, just like Trials from the Source/Target domains, e.g.:
    - `env.reset(new_trial=False, sample_type=1)` asks for a test episode from the same trial;
    
    - `env.reset(new_trial=True, sample_type=0)` asks for a train episode from a new trial (that is, the request is passed from the API shell to `btgym_server`, which itself requests a new trial from `data_server` and then samples an episode from the freshly obtained trial).
    
        
- To be implemented soon: pass kwargs `b_alpha` and `b_beta` to get skewed train data sampling.
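
The request path described above can be sketched with a toy server. This is not the BTgym implementation: `make_toy_data_server` and `ToyEpisodeServer` are hypothetical names, and a trial is reduced to a numbered record so that it is easy to see when a new one is fetched.

```python
import itertools


def make_toy_data_server():
    """Hypothetical data server: every call hands out a fresh trial record."""
    counter = itertools.count()

    def get_trial():
        return {'trial_id': next(counter)}

    return get_trial


class ToyEpisodeServer:
    """Hypothetical episode server that keeps the current trial between resets."""

    def __init__(self, get_trial):
        self.get_trial = get_trial
        self.trial = None

    def reset(self, new_trial=False, sample_type=0):
        # Fetch a new trial only on request (or when none is held yet).
        if new_trial or self.trial is None:
            self.trial = self.get_trial()
        # sample_type selects the subset the episode would be drawn from.
        subset = 'test' if sample_type == 1 else 'train'
        return self.trial['trial_id'], subset


server = ToyEpisodeServer(make_toy_data_server())
first = server.reset(new_trial=True, sample_type=0)
second = server.reset(new_trial=False, sample_type=1)
third = server.reset(new_trial=True, sample_type=0)
print(first, second, third)
```

Here the second call reuses the trial obtained by the first one, while the third call triggers a request for a new trial.
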

##### Now print whole path:

In [None]:
print('Got instance of: {}\nholding data: {}\nmetadata: {}'.
      format(type(domain), domain.filename, domain.metadata))

print('  |\nsample()\n  |')

print('got instance of: {}\nholding data: {}\nmetadata: {}'.
      format(type(trial), trial.filename, trial.metadata))

print('  |\nsample()\n  |')

print('got instance of: {}\nholding data: {}\nmetadata: {}'.
      format(type(episode), episode.filename, episode.metadata))

print('  |\nto_btfeed()\n  |')

print('got instance of: {},\n...which is ready to be fed to bt.Cerebro'.format(type(episode.to_btfeed())))

##### Note:
- The `BTgymDataset` class used in most examples is simply a special case where Trial=Episode by definition.
- This nested data structure is intended mostly for the upcoming implementation of meta-learning and guided-policy-search algorithms.
- Using kwargs in the `.reset()` method is an extension of the Gym API.