29 test category mapping techniques #37

Merged
merged 21 commits into dev from 29_test_category_mapping_techniques
Jun 24, 2024

Conversation

cmbrennan002
Contributor


Description

This is a draft PR, created to hold work over between sprints. So far, the code:

  • Expands the keyword search, then uses the keywords to extract all sentences that mention a particular subcategory
  • Creates phrases for matching from these sentences (manually)
  • Calculates the cosine similarity between every phrase labelled as 'benefits' in the binary-labelled dataset and the phrases for each subcategory
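
The extraction and scoring steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names are hypothetical, and a plain bag-of-words count vector stands in for whatever phrase representation the project actually uses.

```python
import math
import re
from collections import Counter


def extract_sentences(text, keywords):
    """Return the sentences that contain at least one keyword (case-insensitive)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    lowered = [k.lower() for k in keywords]
    return [s for s in sentences if any(k in s.lower() for k in lowered)]


def bow_vector(phrase):
    """Bag-of-words term counts. Assumption: a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z']+", phrase.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty phrase has no direction to compare
    return dot / (norm_a * norm_b)


def score_matrix(benefit_phrases, subcategory_phrases):
    """scores[i][j] is the similarity of benefit phrase i to subcategory phrase j."""
    sub_vecs = [bow_vector(p) for p in subcategory_phrases]
    return [
        [cosine_similarity(bow_vector(b), s) for s in sub_vecs]
        for b in benefit_phrases
    ]
```

Calling `score_matrix` once per subcategory gives one row of scores per benefit phrase, which is the input the later aggregation step would need.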

Next steps:

  • Iterate through the phrases to calculate cosine similarity for the full dataset
  • Use the maximum and average scores to predict the final category
  • Increase the range of categories
  • Improve the efficiency of the score calculation
  • Switch to using the benefit classifier's output rather than the labelled input used currently
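
One possible way to combine the maximum and average scores into a prediction is sketched below. This is only an assumption about how the next step might look: `predict_category` and the `max_weight` blending parameter are hypothetical and do not come from this PR.

```python
from statistics import mean


def predict_category(scores_by_category, max_weight=0.5):
    """Pick the category with the highest blend of max and mean similarity.

    scores_by_category maps a category name to the list of similarity scores
    between one benefit phrase and that category's phrases.
    max_weight is a hypothetical tuning parameter (1.0 = max only, 0.0 = mean only).
    """
    combined = {}
    for category, scores in scores_by_category.items():
        if not scores:
            continue  # skip categories with no phrases to compare against
        combined[category] = (
            max_weight * max(scores) + (1 - max_weight) * mean(scores)
        )
    if not combined:
        return None, {}
    best = max(combined, key=combined.get)
    return best, combined
```

Blending the two statistics is a design choice: the maximum rewards a single close phrase match, while the average rewards consistent similarity across all of a category's phrases.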

Fixes # (issue)

Instructions for Reviewer

In order to test the code in this PR you need to ...

Please pay special attention to ...

Checklist:

  • I have refactored my code out from notebooks/
  • I have checked the code runs
  • I have tested the code
  • I have run pre-commit and addressed any issues not automatically fixed
  • I have merged any new changes from dev
  • I have documented the code
    • Major functions have docstrings
    • Appropriate information has been added to READMEs
  • I have explained this PR above
  • I have requested a code review

@RFOxbury RFOxbury marked this pull request as ready for review June 24, 2024 08:30
@RFOxbury RFOxbury merged commit de74a3c into dev Jun 24, 2024
@RFOxbury RFOxbury deleted the 29_test_category_mapping_techniques branch June 24, 2024 08:33

2 participants