data_utils
==========

The :py:mod:`data_utils` module provides the functions and class used to download Wikipedia dumps, parse them to ndjson, and clean the resulting texts.
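
The name ``parse_to_ndjson`` points to newline-delimited JSON (one JSON value per line) as the parsed output format. The following is a minimal sketch of that file format using only the standard library; it is not wikirec's own code. The helper names ``write_ndjson`` and ``read_ndjson`` and the sample records are hypothetical and exist only for illustration.

```python
import json
import os
import tempfile


def write_ndjson(records, path):
    """Write each record as one JSON document per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_ndjson(path):
    """Read a newline-delimited JSON file back into a list."""
    with open(path, "r", encoding="utf-8") as f:
        # Skip blank lines so a trailing newline does not raise an error.
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical records: a title paired with its article text.
sample_articles = [
    ["Example Book One", "Text of the first article."],
    ["Example Book Two", "Text of the second article."],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    ndjson_path = os.path.join(tmp_dir, "articles.ndjson")
    write_ndjson(sample_articles, ndjson_path)
    loaded_articles = read_ndjson(ndjson_path)

titles = [article[0] for article in loaded_articles]
```

Because every line is an independent JSON document, a file in this format can be appended to and read back line by line without loading the whole file into memory.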

Functions

.. autofunction:: wikirec.data_utils.input_conversion_dict
.. autofunction:: wikirec.data_utils.download_wiki
.. autofunction:: wikirec.data_utils._process_article
.. autofunction:: wikirec.data_utils._iterate_and_parse_file
.. autofunction:: wikirec.data_utils.parse_to_ndjson
.. autofunction:: wikirec.data_utils._combine_tokens_to_str
.. autofunction:: wikirec.data_utils._clean_text_strings
.. autofunction:: wikirec.data_utils._lower_remove_unwanted
.. autofunction:: wikirec.data_utils._lemmatize
.. autofunction:: wikirec.data_utils._subset_and_combine_tokens
.. autofunction:: wikirec.data_utils.clean

Classes

.. autoclass:: wikirec.data_utils.WikiXmlHandler
   :members:
   :private-members:
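
The class name ``WikiXmlHandler`` suggests a handler for Wikipedia XML dumps. A common way to stream such files without loading them whole is a SAX content handler from the standard library's ``xml.sax`` package. The sketch below illustrates that general pattern only. ``TitleTextHandler`` and the sample XML are hypothetical, and the code is not wikirec's implementation.

```python
from xml.sax import parseString
from xml.sax.handler import ContentHandler


class TitleTextHandler(ContentHandler):
    """Hypothetical handler that collects (title, text) pairs per <page>."""

    def __init__(self):
        super().__init__()
        self._current_tag = None  # tag whose character data is being buffered
        self._buffer = []
        self._values = {}
        self.pages = []

    def startElement(self, name, attrs):
        # Start buffering only for the tags of interest.
        if name in ("title", "text"):
            self._current_tag = name
            self._buffer = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        if self._current_tag is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self._current_tag:
            self._values[name] = "".join(self._buffer)
            self._current_tag = None
        if name == "page":
            # A page is complete: store its title and text, then reset.
            self.pages.append(
                (self._values.get("title"), self._values.get("text"))
            )
            self._values = {}


# Hypothetical, heavily simplified stand-in for a dump fragment.
sample_xml = (
    "<mediawiki>"
    "<page><title>Example Book</title>"
    "<revision><text>Some article text.</text></revision></page>"
    "</mediawiki>"
)

handler = TitleTextHandler()
parseString(sample_xml.encode("utf-8"), handler)
```

After parsing, ``handler.pages`` holds one tuple per ``<page>`` element. For large files the same handler could be fed incrementally through ``xml.sax.make_parser()`` and its ``feed`` method instead of ``parseString``.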