Feature Extraction

HTML Tokenization

webstruct.html_tokenizer

HtmlToken

HtmlTokenizer

Feature Extraction Utilities

webstruct.feature_extraction

HtmlFeatureExtractor

Predefined Feature Functions

webstruct.features

webstruct.features.token_features

webstruct.features.data_features

webstruct.features.block_features

webstruct.features.global_features

Gazetteer Support

webstruct.gazetteers.features

webstruct.gazetteers.geonames