
HTML Text Classification #289

Closed
slavakurilyak opened this issue May 12, 2020 · 2 comments

@slavakurilyak

Is your feature request related to a problem? Please describe.
Every page on the World Wide Web (WWW) is an HTML page, so its tags could serve as metadata during labeling. Today, data labelers are forced to rely on plain text alone when performing text classification.

Describe the solution you'd like
As a data labeler, I would like to perform text classification on HTML files, similar to HTML NER Tagging.

Describe alternatives you've considered
I'm assuming that NER models are similar to text classification models.

Additional context
None

@niklub
Collaborator

niklub commented May 13, 2020

Hi, @slavakurilyak!

You can do HTML classification by applying the following config:

<View>
  <Choices name="choice" toName="text">
    <Choice value="Class A"></Choice>
    <Choice value="Class B"></Choice>
  </Choices>
  <HyperText name="text" value="$text"></HyperText>
</View>

Do you think this config should be added to the templates list?
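
For readers trying the config above: `value="$text"` means the HyperText tag renders whatever is stored under the `text` key of each task, so the HTML to classify goes into that field. The following is a minimal sketch of preparing such tasks, not an official import procedure. The helper `build_tasks` and the sample pages are hypothetical, and it assumes the tool accepts a JSON list of objects whose keys match the config variables, which may vary by version.

```python
import json


def build_tasks(html_pages):
    """Wrap each raw HTML string in a task dict keyed by "text".

    The key name is chosen to match value="$text" in the labeling config.
    (Hypothetical helper, not part of any library.)
    """
    return [{"text": html} for html in html_pages]


# Hypothetical sample pages, used only for illustration.
pages = [
    "<h1>Quarterly report</h1><p>Revenue grew this quarter.</p>",
    "<h1>Match recap</h1><p>The home team won in overtime.</p>",
]

tasks = build_tasks(pages)

# Serialize to JSON; this string could be saved to a file and imported as tasks.
tasks_json = json.dumps(tasks, indent=2)
print(tasks_json)
```

Each task keeps its markup intact, so the labeler sees the rendered HTML while choosing between "Class A" and "Class B".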

@slavakurilyak
Author

@niklub Yes, please do!

Tracking clicks and views on the templates and the playground will allow Heartex Labs to better understand the popularity of this template.
