Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bert extractor class #4

Merged
merged 17 commits into from
Oct 11, 2022
Merged

Conversation

homomorfism
Copy link
Contributor

Implementation of bert feature extractor, followed by our discussion idea.

Didn't tried train model, based on the proposed hypothesis, only implementation

@implausibleDeniability
Copy link
Contributor

To clarify: further work on this feature may be performed by anybody, not necessarily by @homomorfism

src/bert_featurizer.py Outdated Show resolved Hide resolved
Copy link
Member

@DanisAlukaev DanisAlukaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@homomorfism
Copy link
Contributor Author

@implausibleDeniability please check my changes

@homomorfism
Copy link
Contributor Author

Tests are failing, do not merge

@homomorfism
Copy link
Contributor Author

Now tests are ok

Please, do not merge changes: today l am planning to train classifier, based on (untrained) bert features and handcrafted features. And then we will push changes to develop branch.

@implausibleDeniability
Copy link
Contributor

implausibleDeniability commented Oct 4, 2022

Shamil, reviewing many unrelated changes is not that easy =(

Why do you add text features extractor to the branch with bert features extractor? That confuses the review greatly. If it's possible, I would suggest to split the changes to several branches:

  1. Checkout new branch from dev, for instance "text-features-extractor"
  2. Cherry-pick relevant commits, where you add class TextFeaturesExtractor, from the current branch to the branch "text-features-extractor"
  3. Remove irrelevant to the current branch changes (just removing the files, or reverting commits, whatever easier)

You may do this several times if it seems you can decompose the current branch to many brances

src/metrics.py Outdated Show resolved Hide resolved
Base automatically changed from develop to main October 4, 2022 15:03
@DanisAlukaev DanisAlukaev changed the base branch from main to dev October 4, 2022 15:22
src/metrics.py Outdated Show resolved Hide resolved
src/bert_featurizer_solution.py Outdated Show resolved Hide resolved
src/text_cleaning/text_feature_extractor.py Outdated Show resolved Hide resolved
@implausibleDeniability
Copy link
Contributor

Good Spellchecking btw!

@homomorfism
Copy link
Contributor Author

I have added Cross Validation, that generates file with results in each split and overall mean score

image

@DanisAlukaev DanisAlukaev self-requested a review October 11, 2022 18:40
Copy link
Member

@DanisAlukaev DanisAlukaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, big commit. Everything seems fine to me!

@implausibleDeniability
Copy link
Contributor

Заебись, Шамиль =)

@implausibleDeniability implausibleDeniability deleted the feature/bert-feature-extractor branch October 11, 2022 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants