Feature/wandb logging #218

Merged
merged 44 commits into main from feature/wandb_logging
Apr 6, 2022

asawczyn (Collaborator) commented Feb 17, 2022

Additionally closes #152, but a problem remains with the Flair HPS pipelines, e.g. at embeddings/pipeline/flair_hps_pipeline.py:139.

An example of the HPS logging output:
https://wandb.ai/albert__/test-hps-run-3
It was produced by the following source code:

from embeddings.hyperparameter_search.lighting_configspace import (
    LightingTextClassificationConfigSpace,
)
from embeddings.hyperparameter_search.parameters import ConstantParameter
from embeddings.pipeline.lightning_classification import LightningClassificationPipeline
from embeddings.pipeline.lightning_hps_pipeline import OptimizedLightingClassificationPipeline
from embeddings.utils.loggers import LightningLoggingConfig

# Keep only a small sample (0.03) of each split so the example runs quickly.
LightningClassificationPipeline.DEFAULT_DATAMODULE_KWARGS.update(
    {
        "downsample_train": 0.03,
        "downsample_val": 0.03,
        "downsample_test": 0.03,
    }
)
WANDB_PROJECT = "test-hps-run-3"

for embedding_name in ["allegro/herbert-base-cased", "hf-internal-testing/tiny-albert"]:
    # Search space with no fine-tuned layers and a fixed number of epochs.
    config_space = LightingTextClassificationConfigSpace(
        embedding_name_or_path=embedding_name,
        finetune_last_n_layers=ConstantParameter("finetune_last_n_layers", 0),
        max_epochs=ConstantParameter("max_epochs", 5),
    )
    # Run the hyperparameter search and persist the best parameters and the search log.
    pipeline = OptimizedLightingClassificationPipeline(
        config_space=config_space,
        dataset_name_or_path="clarin-pl/polemo2-official",
        input_column_name="text",
        target_column_name="target",
        logging_config=LightningLoggingConfig.from_flags(
            wandb=True,
            tracking_project_name=WANDB_PROJECT,
        ),
        n_trials=3,
    ).persisting(
        best_params_path="best_params.yaml",
        log_path="hps_log.pickle",
    )
    df, metadata = pipeline.run(run_name=f"search-{embedding_name}")

    # Retrain with the metadata returned by the search, logging to the same W&B project.
    pipeline = LightningClassificationPipeline(
        output_path=".",
        logging_config=LightningLoggingConfig.from_flags(
            wandb=True,
            tracking_project_name=WANDB_PROJECT,
        ),
        **metadata,
    )
    pipeline.run(run_name=f"best-params-retrain-{embedding_name}")
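
The `from_flags` call in the example above turns boolean switches into a logging configuration. As a rough illustration of that pattern only, here is a minimal sketch. `SketchLoggingConfig`, its fields, and the `csv` and `tensorboard` flags are hypothetical names chosen for this sketch; it is not the library's `LightningLoggingConfig` and may differ from it in every detail.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchLoggingConfig:
    """Hypothetical stand-in; NOT the library's LightningLoggingConfig."""

    loggers_names: List[str] = field(default_factory=list)
    tracking_project_name: Optional[str] = None

    @classmethod
    def from_flags(
        cls,
        wandb: bool = False,
        csv: bool = False,
        tensorboard: bool = False,
        tracking_project_name: Optional[str] = None,
    ) -> "SketchLoggingConfig":
        # Collect the name of every backend whose flag is switched on.
        flags = {"wandb": wandb, "csv": csv, "tensorboard": tensorboard}
        names = [name for name, enabled in flags.items() if enabled]
        return cls(loggers_names=names, tracking_project_name=tracking_project_name)


# Mirrors the call shape used in the example above.
config = SketchLoggingConfig.from_flags(
    wandb=True,
    tracking_project_name="test-hps-run-3",
)
print(config.loggers_names)
```

Keeping the flags in a classmethod means the search pipeline and the retraining pipeline can be handed the same configuration object, which is how the example above ends up logging both runs to one project.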

@asawczyn asawczyn marked this pull request as ready for review March 3, 2022 16:38
embeddings/utils/utils.py (outdated, resolved)
embeddings/data/datamodule.py (resolved)
embeddings/pipeline/hps_pipeline.py (outdated, resolved)
embeddings/pipeline/hps_pipeline.py (resolved)
embeddings/pipeline/lightning_hps_pipeline.py (resolved)
embeddings/pipeline/lightning_pipeline.py (outdated, resolved)
embeddings/utils/utils.py (resolved)
examples/evaluate_lightning_sequence_labeling.py (outdated, resolved)
embeddings/pipeline/hps_pipeline.py (resolved)
embeddings/utils/loggers.py (resolved)
laugustyniak previously approved these changes Apr 6, 2022
ktagowski previously approved these changes Apr 6, 2022
@asawczyn asawczyn dismissed stale reviews from ktagowski and laugustyniak via 01f5b41 April 6, 2022 12:29
@ktagowski ktagowski merged commit faca081 into main Apr 6, 2022
@ktagowski ktagowski deleted the feature/wandb_logging branch April 6, 2022 14:53
mkossakowski19 pushed a commit that referenced this pull request May 12, 2022
* refactor: move default kwargs

* feat(logging): add args for logging

* feat(logging): update pyproject and tests

* feat(logging): add logging to seq labelling

* fix: fixes after rebase

* feat(logging): add naming of run in hps

* test: repair hps tests

* add dataset info to logging

* ci(poetry): lock dependencies

* feat(logging): add evaluation.json as artifact to wandb

* fix: fix artifacts naming

* style: fix formatting

* feat: make saving params more unified

* feat: add dataset version to hparams

* feat(logging): organize artifacts

* feat(scripts): add hps examples

* fix(scripts): fix document classification script

* fix: fix lack of test metrics

* feat: add logits to evaluation.json

* fix: apply softmax to logits

* fix: fix test stem in seq labelling

* refactor(tests): refactor result path in tests

* feat(sequence-labelling): add names and probabilities to returned dict

* test: disable mkdirs during tests

* fix: fix sequence labelling test

* fix: lock dependencies

* fix: froze click

* refactor: refactor due to comments in PR

* feat: add target names to text classification

* style: format code

* test: fix tests and add missing

* refactor: refactor due to pr comments

* feat(logging): add LightningLoggingConfig

* test: fix hps test

* fix: fix literal type import

* test: fix hps test

* fix: fix list type

* ci: update lock

* fix: artifacts naming

* refactor: edit default exceptions in hps

* refactor: move mkdir

* refactor: add logger wrappers

* refactor: move get_lightning_loggers to loggers config

* refactor: refactor due to pr comments

Co-authored-by: Albert Sawczyn <albert.sawczyn@pwr.edu.pl>
Development

Successfully merging this pull request may close these issues.

Fix mypy issue in embeddings/pipeline/lighting_hps_pipeline.py
4 participants