[question] Partial matching of strings #57

Open
andrei-volkau opened this issue May 13, 2021 · 2 comments

Comments

@andrei-volkau

Goal: group the following two strings into the same group.

Should you raise an adverse event in a specific patient?
Although what you say will be treated in confidence, should you raise an adverse event in a specific patient?

The following code creates separate groups for those strings.

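from string_grouper import StringGrouper

# question_table is a pandas DataFrame with a "question" column of strings
string_grouper = StringGrouper(question_table["question"])
string_grouper = string_grouper.fit()
question_table["labels"] = string_grouper.get_groups()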

Question: is it possible to adjust string matching to reach the goal?

Thank you in advance for any hints!

@ParticularMiner
Contributor

Hi @andrei-volkau

Simply lower the similarity threshold, min_similarity (the default is 0.8). For example, you could try the following:

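from string_grouper import StringGrouper

# min_similarity=0.5 is only a starting point; the default is 0.8
string_grouper = StringGrouper(question_table["question"], min_similarity=0.5)
string_grouper = string_grouper.fit()
question_table["labels"] = string_grouper.get_groups()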

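If the two strings still land in separate groups, keep lowering min_similarity, bearing in mind that a lower threshold also lets less similar strings into the same group.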
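
For intuition on why the default threshold keeps these two strings apart: string_grouper scores pairs by cosine similarity over TF-IDF-weighted character n-grams. The sketch below is not the library's code. It is a standard-library approximation that assumes plain n-gram counts (no IDF weighting), 3-grams, and a simplified clean-up step, so the score it prints will differ from what string_grouper computes. The helper names char_ngrams and cosine_similarity are made up for illustration.

```python
import re
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Lower-case the text, drop whitespace and , - . / and return its character n-grams."""
    # Assumption: this clean-up only approximates the library's preprocessing.
    cleaned = re.sub(r"[,\-./]|\s", "", text.lower())
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]


def cosine_similarity(a, b, n=3):
    """Cosine similarity between the raw n-gram count vectors of two strings."""
    va, vb = Counter(char_ngrams(a, n)), Counter(char_ngrams(b, n))
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


short_q = "Should you raise an adverse event in a specific patient?"
long_q = (
    "Although what you say will be treated in confidence, "
    "should you raise an adverse event in a specific patient?"
)

score = cosine_similarity(short_q, long_q)
print(f"approximate similarity: {score:.2f}")
```

The shorter question is wholly contained in the longer one, yet the extra leading clause contributes n-grams that only one string has, which pulls the score below 1. That gap is what min_similarity has to accommodate.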

@ParticularMiner
Contributor

Notify: @andrei-volkau

There are a few more options described in the README (follow the link).

