Skip to content

Add S3FileLogger and LocalFileLogger #372

Merged
merged 24 commits into from
Dec 15, 2021
Merged

Add S3FileLogger and LocalFileLogger #372

merged 24 commits into from
Dec 15, 2021

Conversation

Mr-Geekman
Copy link
Contributor

@Mr-Geekman Mr-Geekman commented Dec 10, 2021

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

Look #344.

Related Issue

#344.

Closing issues

Closes #344.

@Mr-Geekman Mr-Geekman added the enhancement New feature or request label Dec 10, 2021
@Mr-Geekman Mr-Geekman self-assigned this Dec 10, 2021
@codecov-commenter
Copy link

codecov-commenter commented Dec 10, 2021

Codecov Report

Merging #372 (5d1bd81) into master (e4e4b31) will decrease coverage by 0.31%.
The diff coverage is 72.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #372      +/-   ##
==========================================
- Coverage   87.47%   87.16%   -0.32%     
==========================================
  Files          97       98       +1     
  Lines        4736     4908     +172     
==========================================
+ Hits         4143     4278     +135     
- Misses        593      630      +37     
Impacted Files Coverage Δ
etna/loggers/console_logger.py 89.65% <ø> (ø)
etna/loggers/wandb_logger.py 34.61% <12.50%> (+2.43%) ⬆️
etna/loggers/file_logger.py 73.05% <73.05%> (ø)
etna/loggers/__init__.py 100.00% <100.00%> (ø)
etna/loggers/base.py 95.31% <100.00%> (+1.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4e4b31...5d1bd81. Read the comment docs.

@Mr-Geekman Mr-Geekman marked this pull request as ready for review December 10, 2021 15:54
pyproject.toml Outdated
@@ -52,6 +52,8 @@ saxpy = "^1.0.1-dev167"
hydra-slayer = "^0.2.0"
typer = "^0.4.0"
omegaconf = "^2.1.1"
holidays = "^0.11.3"
boto3 = "^1.20.23"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to limit with older versions maybe three or more year old, or use "*".
To avoid dependencies problem.

self._current_experiment_folder = self.experiment_folder.joinpath(f"{job_type}_{group}")
self._current_experiment_folder.mkdir()

def _save_table(self, table: pd.DataFrame, name: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add gzip flag while logger init

"""
super().__init__()
if not os.path.isdir(experiments_folder):
raise ValueError(f"Folder {experiments_folder} doesn't exist")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems we can create folder if it doesn't exist without troubles

raise ValueError("You should start experiment before using log_backtest_run or log_backtest_metrics")
filename = f"{name}.json"
with open(self._current_experiment_folder.joinpath(filename), "w") as ouf:
json.dump(dictionary, ouf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should catch exceptions if some objects not serializable

metrics.columns = ["segment"] + columns_name

try:
self._save_table(metrics, "metrics")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics has strange columns for some reasons

segment,segment,MAE,MSE,MAPE,SMAPE,R2
0,segment_1,7.763682014880712,63.082959455152306,70.50005183059184,110.20437418869618,-27.28124460927375
1,segment_2,6.169644589069107,39.57608243305653,3.397196323227341e+17,165.08659522722252,-20.297610322998022
2,segment_3,4.148008953160193,26.044276348095234,2.2313460234094784e+17,163.44802769939656,-10.475877270195646
3,segment_8,4.1515436266736705,23.73210099641872,33.21865137226771,42.703159606591534,-8.449524500405012
4,segment_0,1.2761743904371403,2.145912681208096,5.64154880927096,5.855982803150363,-1.2013605113854
5,segment_7,4.802545981302062,24.85181975228773,2.956777063102111e+17,90.2986826786166,-21.374242621538958
6,segment_6,4.951371383445379,25.260573194944055,1.6948198758644096e+18,34.42165015381447,-31.770833878003458
7,segment_9,2.7972933788078693,10.25362118755356,1.3365994548979092e+18,24.294594073351483,-3.004909993258983
8,segment_4,2.3343613388059667,7.798571395070086,5.050117708467421,5.19826439721232,-0.4185895718027921
9,segment_5,3.631983273631512,14.911636014941061,1.6744240388728714e+18,19.535007065540476,-19.032146156809308

DATETIME_FORMAT = "%Y-%m-%dT%H-%M-%S"


class BaseFileLogger(BaseLogger):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add config in init like in Wandblogger -- to save experiments params and pipeline.

def _check_bucket(self):
try:
self.s3_client.head_bucket(Bucket=self.bucket)
except ParamValidationError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another error type:

  File "/home/marti/projects/etna-1/etna/loggers/file_logger.py", line 308, in _check_bucket
    self.s3_client.head_bucket(Bucket=self.bucket)
  File "/home/marti/.cache/pypoetry/virtualenvs/etna-hkIKRVwg-py3.8/lib/python3.8/site-packages/botocore/client.py", line 391, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/marti/.cache/pypoetry/virtualenvs/etna-hkIKRVwg-py3.8/lib/python3.8/site-packages/botocore/client.py", line 719, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadBucket operation: Not Found

And that's why proper ValueError is not raised

# create subfolder for current experiment
cur_datetime = datetime.datetime.now()
subfolder_name = cur_datetime.strftime(DATETIME_FORMAT)
self.experiment_folder = pathlib.Path(self.experiments_folder).joinpath(subfolder_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to pathlib.Path(self.experiments_folder).resolve() to get full path of experiment

group:
Specify a group to organize individual runs into a larger experiment.
"""
self._current_experiment_folder = self.experiment_folder.joinpath(f"{job_type}_{group}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If job_type=None and group=None I'll get invalid folder name

raise e

# case for aggregate_metrics=False
if "fold_number" in metrics_df.columns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it in external function that is common for all loggers?

# create subfolder for current experiment
cur_datetime = datetime.datetime.now()
subfolder_name = cur_datetime.strftime(DATETIME_FORMAT)
self.experiment_folder = os.path.join(experiments_folder, subfolder_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we should change this path concat too.

Copy link
Contributor

@martins0n martins0n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

@martins0n martins0n merged commit 39f63c3 into master Dec 15, 2021
@martins0n martins0n deleted the issue-344 branch December 15, 2021 07:29
martins0n pushed a commit that referenced this pull request Dec 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add S3FileLogger and LocalFileLogger
3 participants