document settings #722

noviluni · 2020-06-25T12:22:56Z

I've been checking the current settings as a preliminary step before working on #630.

~~I've been reordering the settings, reordering the settings in the docs, and checking the currently undocumented settings.~~

~~The undocumented settings are:~~
* SKIP_TOKENS_PARSER
* NORMALIZE
* FUZZY

~~I documented the NORMALIZE setting, however, after some checks, it seems that those settings are not documented because:~~
* There is no reason or use case for changing them (they were created for legacy reasons).
* Nobody has asked about them in issues or elsewhere.

Given this, I don't think they are useful, and it makes no sense to keep them. For example, the default list in SKIP_TOKENS_PARSER can be set directly in the code, and NORMALIZE=True can always be applied, because the setting is very specific: it only makes sense when the checked source (the CLDR JSON and YAML files) is not normalized, and if the source is normalized it doesn't work.
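To illustrate why normalization has to match on both sides of a comparison, here is a minimal sketch. It assumes that NORMALIZE amounts to stripping diacritics through Unicode decomposition; `strip_accents` and `reference_words` are hypothetical names for this example, and the actual helper in the library may differ.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Hypothetical reference data, stored without accents.
reference_words = {"miercoles", "sabado"}

raw = "miércoles"
print(raw in reference_words)                 # False: accented input, unaccented data
print(strip_accents(raw) in reference_words)  # True: both sides are now unaccented
```

If the reference data kept its accents instead, the second lookup would be the one that fails, which is why the flag and the state of the source data cannot be chosen independently.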

~~My proposal is to remove them in the next major version but I need feedback on this. @Gallaecio~~

I documented all the undocumented settings and made some improvements to the documentation structure.

codecov · 2020-06-25T12:28:21Z

Codecov Report

Merging #722 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #722   +/-   ##
=======================================
  Coverage   98.24%   98.24%           
=======================================
  Files         231      231           
  Lines        2569     2569           
=======================================
  Hits         2524     2524           
  Misses         45       45

Impacted Files	Coverage Δ
dateparser/__init__.py	`100.00% <ø> (ø)`
dateparser/conf.py	`100.00% <ø> (ø)`
dateparser/date.py	`99.57% <ø> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2908d21...255ae51.

noviluni · 2020-06-25T20:48:31Z

Proposal for deleting SKIP_TOKENS_PARSER: #728

Gallaecio · 2020-06-30T16:47:31Z

I think removing them makes sense.

noviluni · 2020-09-08T14:57:55Z

I will wait until #779 is merged, to avoid conflicts.

noviluni · 2020-10-27T13:33:41Z

docs/conf.rst

@@ -1,5 +0,0 @@
-Configurations


Removed as it was not used

noviluni · 2020-10-27T13:34:25Z

docs/settings.rst

@@ -0,0 +1,189 @@
+.. _settings:


Added a dedicated section for the settings (extracted from usage.rst), with several error fixes and improvements.

noviluni · 2020-10-27T13:36:19Z

docs/usage.rst

-The instance of :class:`DateDataParser <dateparser.date.DateDataParser>` reduces the number
-of applicable languages, until only one or no language is left. It
-assumes the previously detected language for all the subsequent dates supplied.
-
-This class wraps around the core :mod:`dateparser` functionality, and by default
-assumes that all of the dates fed to it are in the same language.
-
-.. autoclass:: dateparser.date.DateDataParser
-   :members: get_date_data
-
-.. warning:: It fails to parse *English* dates in the example below, because *Spanish* was detected and stored with the ``ddp`` instance:
-
-    >>> ddp.get_date_data('11 August 2012')
-    {'date_obj': None, 'period': 'day'}
-


This is no longer true, so I removed it.

noviluni · 2020-10-27T13:37:14Z

dateparser/conf.py

+    * `SKIP_TOKENS`
+    * `NORMALIZE`
+    * `RETURN_TIME_AS_PERIOD`
+    * `PARSERS`


I added all the missing settings and removed the descriptions, because keeping them here makes the list harder to maintain. For up-to-date descriptions, see the settings section in the docs.
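A bare list of names like this can still drift from the code. One way to catch that is a small check, sketched here under the assumption that the defaults live in a plain dict. `DEFAULTS`, `DOCSTRING`, and `undocumented_settings` are hypothetical names for this example, not part of dateparser, and the values are placeholders rather than real defaults.

```python
import re

# Hypothetical defaults mapping; values are placeholders, not real defaults.
DEFAULTS = {
    "SKIP_TOKENS": None,
    "NORMALIZE": None,
    "RETURN_TIME_AS_PERIOD": None,
    "PARSERS": None,
}

# Hypothetical docstring that lists setting names only, one bullet per name.
DOCSTRING = """
Supported settings:

    * `SKIP_TOKENS`
    * `NORMALIZE`
    * `RETURN_TIME_AS_PERIOD`
"""


def undocumented_settings(defaults, docstring):
    """Return the setting names present in defaults but absent from the list."""
    documented = set(re.findall(r"\* `([A-Z_]+)`", docstring))
    return sorted(set(defaults) - documented)


print(undocumented_settings(DEFAULTS, DOCSTRING))  # ['PARSERS']
```

A check like this only compares names, so it stays valid when a description changes in the docs.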

noviluni · 2020-10-27T13:38:07Z

dateparser/date.py

@@ -268,7 +268,7 @@ class DateDataParser:

    :param locales:
        A list of locale codes, e.g. ['fr-PF', 'qu-EC', 'af-NA'].
-        The parser uses locales to translate date string.
+        The parser uses only these locales to translate date string.


Clarified because of the comment here: #789 (comment)

Gallaecio

Nice!

document settings

fabab5c

noviluni requested a review from Gallaecio June 25, 2020 12:23

noviluni changed the title ~~document settings~~ WIP: document settings Jun 25, 2020

noviluni added this to the v1.0.0 milestone Jun 25, 2020

noviluni added the breaking-change label Jun 25, 2020

noviluni mentioned this pull request Jun 25, 2020

delete SKIP_TOKENS_PARSER setting #728

Merged

noviluni mentioned this pull request Jun 26, 2020

wip:settings validation #630

Closed

noviluni removed the breaking-change label Jun 30, 2020

This was referenced Sep 24, 2020

remove FUZZY setting #794

Merged

Improve settings #796

Open

marc added 2 commits September 28, 2020 13:38

remove deprecated

f889535

Merge branch 'master' into document_settings

6009386

noviluni marked this pull request as draft September 28, 2020 11:46

fix typo in

5d37d50

noviluni modified the milestones: v1.0.0, 1.1.0 Oct 6, 2020

noviluni mentioned this pull request Oct 26, 2020

add settings validation #797

Merged

marc added 5 commits October 27, 2020 13:17

Merge branch 'master' into document_settings

f8bd561

improve docs

9139b11

Merge branch 'master' into document_settings

225b370

add mention to SettingValidationError

be98735

improve some other docs

a171267

noviluni marked this pull request as ready for review October 27, 2020 13:38

noviluni changed the title ~~WIP: document settings~~ document settings Oct 27, 2020

Gallaecio approved these changes Oct 27, 2020

fix flake8 issue

255ae51

noviluni merged commit bc0424a into scrapinghub:master Oct 28, 2020
