-
Notifications
You must be signed in to change notification settings - Fork 80
Add S3FileLogger and LocalFileLogger #372
Conversation
# Conflicts: # CHANGELOG.md
Codecov Report
@@ Coverage Diff @@
## master #372 +/- ##
==========================================
- Coverage 87.47% 87.16% -0.32%
==========================================
Files 97 98 +1
Lines 4736 4908 +172
==========================================
+ Hits 4143 4278 +135
- Misses 593 630 +37
Continue to review full report at Codecov.
|
pyproject.toml
Outdated
@@ -52,6 +52,8 @@ saxpy = "^1.0.1-dev167" | |||
hydra-slayer = "^0.2.0" | |||
typer = "^0.4.0" | |||
omegaconf = "^2.1.1" | |||
holidays = "^0.11.3" | |||
boto3 = "^1.20.23" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to limit with older versions maybe three or more year old, or use "*".
To avoid dependencies problem.
self._current_experiment_folder = self.experiment_folder.joinpath(f"{job_type}_{group}") | ||
self._current_experiment_folder.mkdir() | ||
|
||
def _save_table(self, table: pd.DataFrame, name: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add gzip flag while logger init
etna/loggers/file_logger.py
Outdated
""" | ||
super().__init__() | ||
if not os.path.isdir(experiments_folder): | ||
raise ValueError(f"Folder {experiments_folder} doesn't exist") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems we can create folder if it doesn't exist without troubles
raise ValueError("You should start experiment before using log_backtest_run or log_backtest_metrics") | ||
filename = f"{name}.json" | ||
with open(self._current_experiment_folder.joinpath(filename), "w") as ouf: | ||
json.dump(dictionary, ouf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should catch exceptions if some objects not serializable
metrics.columns = ["segment"] + columns_name | ||
|
||
try: | ||
self._save_table(metrics, "metrics") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Metrics has strange columns for some reasons
segment,segment,MAE,MSE,MAPE,SMAPE,R2
0,segment_1,7.763682014880712,63.082959455152306,70.50005183059184,110.20437418869618,-27.28124460927375
1,segment_2,6.169644589069107,39.57608243305653,3.397196323227341e+17,165.08659522722252,-20.297610322998022
2,segment_3,4.148008953160193,26.044276348095234,2.2313460234094784e+17,163.44802769939656,-10.475877270195646
3,segment_8,4.1515436266736705,23.73210099641872,33.21865137226771,42.703159606591534,-8.449524500405012
4,segment_0,1.2761743904371403,2.145912681208096,5.64154880927096,5.855982803150363,-1.2013605113854
5,segment_7,4.802545981302062,24.85181975228773,2.956777063102111e+17,90.2986826786166,-21.374242621538958
6,segment_6,4.951371383445379,25.260573194944055,1.6948198758644096e+18,34.42165015381447,-31.770833878003458
7,segment_9,2.7972933788078693,10.25362118755356,1.3365994548979092e+18,24.294594073351483,-3.004909993258983
8,segment_4,2.3343613388059667,7.798571395070086,5.050117708467421,5.19826439721232,-0.4185895718027921
9,segment_5,3.631983273631512,14.911636014941061,1.6744240388728714e+18,19.535007065540476,-19.032146156809308
DATETIME_FORMAT = "%Y-%m-%dT%H-%M-%S" | ||
|
||
|
||
class BaseFileLogger(BaseLogger): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add config in init like in Wandblogger -- to save experiments params and pipeline.
etna/loggers/file_logger.py
Outdated
def _check_bucket(self): | ||
try: | ||
self.s3_client.head_bucket(Bucket=self.bucket) | ||
except ParamValidationError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have another error type:
File "/home/marti/projects/etna-1/etna/loggers/file_logger.py", line 308, in _check_bucket
self.s3_client.head_bucket(Bucket=self.bucket)
File "/home/marti/.cache/pypoetry/virtualenvs/etna-hkIKRVwg-py3.8/lib/python3.8/site-packages/botocore/client.py", line 391, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/marti/.cache/pypoetry/virtualenvs/etna-hkIKRVwg-py3.8/lib/python3.8/site-packages/botocore/client.py", line 719, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadBucket operation: Not Found
And that's why proper ValueError
is not raised
etna/loggers/file_logger.py
Outdated
# create subfolder for current experiment | ||
cur_datetime = datetime.datetime.now() | ||
subfolder_name = cur_datetime.strftime(DATETIME_FORMAT) | ||
self.experiment_folder = pathlib.Path(self.experiments_folder).joinpath(subfolder_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to pathlib.Path(self.experiments_folder).resolve()
to get full path of experiment
etna/loggers/file_logger.py
Outdated
group: | ||
Specify a group to organize individual runs into a larger experiment. | ||
""" | ||
self._current_experiment_folder = self.experiment_folder.joinpath(f"{job_type}_{group}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If job_type=None and group=None I'll get invalid folder name
etna/loggers/file_logger.py
Outdated
raise e | ||
|
||
# case for aggregate_metrics=False | ||
if "fold_number" in metrics_df.columns: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make it in external function that is common for all loggers?
…nfigs, add clarifications to start_experiment
etna/loggers/file_logger.py
Outdated
# create subfolder for current experiment | ||
cur_datetime = datetime.datetime.now() | ||
subfolder_name = cur_datetime.strftime(DATETIME_FORMAT) | ||
self.experiment_folder = os.path.join(experiments_folder, subfolder_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should change this path concat too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥇
IMPORTANT: Please do not create a Pull Request without creating an issue first.
Before submitting (must do checklist)
Type of Change
Proposed Changes
Look #344.
Related Issue
#344.
Closing issues
Closes #344.