Split main #14

mmatera · 2023-02-20T21:09:50Z

Just a first round. The documentation was minimally completed. The criteria used to split the modules were:

Follow the WR guides.
Each module depends on the minimal number of Python library backends.
Auxiliary functions are moved into separate modules, with a no_doc=True attribute.
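
As a rough illustration of the last criterion, here is a minimal sketch of how a documentation pass could skip helper modules marked with no_doc=True. The function documented_modules and the module names under example_pkg are hypothetical, and this is an assumption about how the attribute is consumed, not the actual Mathics documentation code.

```python
import types


def documented_modules(modules):
    """Return only the modules that should show up in the documentation.

    A module opts out by setting a module-level attribute ``no_doc = True``.
    """
    return [m for m in modules if not getattr(m, "no_doc", False)]


# Hypothetical stand-ins for real modules of a package.
helpers = types.ModuleType("example_pkg.helpers")
helpers.no_doc = True  # auxiliary functions: keep them out of the docs

builtins_module = types.ModuleType("example_pkg.translation")

visible = documented_modules([helpers, builtins_module])
print([m.__name__ for m in visible])
```

Only example_pkg.translation is listed, since the helper module opted out.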

pymathics/natlang/translation.py

pymathics/natlang/linguistic_data.py

rocky · 2023-02-20T22:57:34Z

pymathics/natlang/linguistic_data.py

+
+    summary_text = "retrieve a list of common words"
+
+    def eval(self, evaluation: Evaluation, options: dict):


For this function, I am seeing

RecursionError: maximum recursion depth exceeded while calling a Python object

Maybe we comment it out for now?

On my machine and in the CI it seems to work. Do you have a test case that raises this error?

Did you try the first form, WordList[], which I believe is what this particular eval method handles? There is no CI test for this, so unless you ran it explicitly you might not know that it fails.

Also, a principle mentioned in the guide says that every builtin should have at least one example. Adding one might also expose the problem, if there is one.

I tried with WordList[]//Length. I have now added this as a doctest.

And yes, the CI is still clean.

I tried with WordList[]//Length . Now I added this as a doctest.

Ok, good. Details are important; giving these up front saves time.

Yes, WordList[]//Length works for me as well. But WordList[] by itself does not and gives me the recursion error.

This is something we should point out.

Also:

Length[WordList[]] > 10000

is not something that a user is going to be interested in. This feels like something that belongs in a pytest instead. (The distinction between user example and doctest was also mentioned in the guide.)

See the section under Applications in https://wolfram.com/xid/0cq048d2-ov2unu for something that is more akin to a user-oriented example.

WordList[]; does not produce the issue either. So, it seems to be a problem in showing large lists.

WordList[] also works on my system, but takes time to print the result.
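
For the suggestion above to move the Length[WordList[]] > 10000 check out of the user-facing examples, a pytest-style check could look roughly like this sketch. The evaluate argument is a hypothetical callable that turns a Mathics expression string into a Python value; fake_evaluate and the number it returns are made up so that the snippet runs on its own. A real test would evaluate the expression in a Mathics session with pymathics.natlang loaded.

```python
def check_wordlist_size(evaluate, minimum=10000):
    """Assert that WordList[] has more than `minimum` entries.

    Only the length is requested, so the large list itself is never
    formatted for display.
    """
    length = evaluate("Length[WordList[]]")
    assert length > minimum, f"expected more than {minimum} words, got {length}"
    return length


def fake_evaluate(expression):
    # Stand-in evaluator (an assumption for this sketch); the returned
    # value is invented and is not the real size of the word list.
    assert expression == "Length[WordList[]]"
    return 12345


def test_wordlist_size():
    # pytest collects this function by its ``test_`` prefix.
    check_wordlist_size(fake_evaluate)


test_wordlist_size()
```

Keeping the size check here leaves the doctest section free for an example that a user would actually type.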

pymathics/natlang/normalization.py

pymathics/natlang/linguistic_data.py

rocky · 2023-02-20T23:13:41Z

pymathics/natlang/normalization.py

+from mathics.core.list import ListExpression
+
+from pymathics.natlang.spacy import _cases, _pos_tags, _position, _SpacyBuiltin
+


We need a Python statement like:

sort_order = "Text normalization"

Currently in the summary list we have:

Linguistic Data

Text normalization

Text analysis functions

Language translation

"Text normalization" appears before "Language translation" because of the order of the module names: "normalization" sorts before "translation".

Similarly for the other modules.
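
To make the ordering issue concrete, here is a small sketch. It assumes that the summary list is sorted by a module-level sort_order string when present and by the module file name otherwise; that is an assumption about the documentation builder, and summary_order and make_module are hypothetical helpers, not Mathics functions.

```python
import types


def make_module(name, sort_order=None):
    """Build a stand-in module, optionally with a ``sort_order`` attribute."""
    module = types.ModuleType(name)
    if sort_order is not None:
        module.sort_order = sort_order
    return module


def summary_order(modules):
    """Sort by ``sort_order`` if set, else by the last part of the module name."""
    def key(module):
        fallback = module.__name__.rsplit(".", 1)[-1]
        return getattr(module, "sort_order", fallback).lower()
    return [m.__name__.rsplit(".", 1)[-1] for m in sorted(modules, key=key)]


names = ["linguistic_data", "normalization", "textual_analysis", "translation"]
titles = ["Linguistic Data", "Text normalization",
          "Text analysis functions", "Language translation"]

without = [make_module("pymathics.natlang." + n) for n in names]
with_titles = [make_module("pymathics.natlang." + n, t) for n, t in zip(names, titles)]

print(summary_order(without))      # ordered by file name
print(summary_order(with_titles))  # ordered by section title
```

With sort_order set to each section title, "Language translation" moves to the front and "Text normalization" to the end.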

pymathics/natlang/textual_analysis.py

pymathics/natlang/translation.py

rocky · 2023-02-20T23:30:14Z

Overall, I like this very much. This is a big improvement over what came before!

Thanks for undertaking this.

pymathics/natlang/linguistic_data.py

pymathics/natlang/translation.py

pymathics/natlang/linguistic_data.py

pymathics/natlang/textual_analysis.py

rocky · 2023-02-22T05:51:49Z

pymathics/natlang/textual_analysis.py

+
+    ## Problem with import for certain characters in the text.
+    ## >> text = Import["ExampleData/EinsteinSzilLetter.txt"];
+    >> text = "I have a dairy cow, it's not just any cow. \


Line wrapping with "\" does not render correctly in test cases. Look at the Django rendering here.

It is a pity. I have now removed the extra spaces simply by removing the left indent from the continuation lines. I also noticed that in Django, line breaks are ignored when strings are shown.
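
The extra spaces mentioned above can be reproduced in plain Python. This sketch assumes that the docstring is an ordinary (non-raw) string literal, where a backslash at the end of a line joins it to the next one; the sample text is a placeholder, not the text from the doctest.

```python
# Backslash-newline is dropped from an ordinary string literal, but the
# indentation of the continuation line stays inside the string.
indented = "first part \
    second part"

# Same literal with the continuation line flush against the left margin.
flush = "first part \
second part"

print(repr(indented))
print(repr(flush))
```

The first string keeps the four indent spaces in the middle of the text, while the second one reads as a single clean sentence, which matches the fix of dropping the left indent.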

pymathics/natlang/normalization.py

rocky · 2023-02-22T05:59:42Z

I think we are pretty close.

mmatera added 3 commits February 20, 2023 17:49

split main in modules

f840599

adding comments

f5a4282

black

63e72dd

rocky reviewed Feb 20, 2023

pymathics/natlang/translation.py

rocky reviewed Feb 20, 2023

pymathics/natlang/linguistic_data.py

rocky reviewed Feb 20, 2023

pymathics/natlang/linguistic_data.py

rocky reviewed Feb 20, 2023

pymathics/natlang/normalization.py

rocky reviewed Feb 20, 2023

pymathics/natlang/linguistic_data.py

fix summaries

29add1a

mmatera force-pushed the split_main branch from 36a2723 to 29add1a Compare February 20, 2023 23:07

rocky reviewed Feb 20, 2023

pymathics/natlang/textual_analysis.py

rocky reviewed Feb 20, 2023

pymathics/natlang/translation.py

mmatera added 3 commits February 21, 2023 23:30

rocky's comments fixed

2b10a3c

black

042dfe3

test for wordlist

899d0bf

mmatera commented Feb 22, 2023

pymathics/natlang/linguistic_data.py