Codeblock-scoped `# optional` annotations for doctests (modularization) #35750

mkoeppe · 2023-06-09T16:53:57Z

They would enable us to:

keep doctest annotations sufficiently fine-grained for the purposes of test coverage in modularized tests
reduce the visual clutter in source code and formatted documentation
reduce the maintenance effort (in particular when using primitive text editors).

See

https://groups.google.com/g/sage-devel/c/utA0N1it0Eo/m/RMUBMbtbAgAJ and references within (sage-devel, 2023-06)
Discussion on naming of the keyword for standard features: https://groups.google.com/g/sage-devel/c/OUnoroIf0qc (sage-devel, 2023-06)

Switching from the current line-level annotations to block-level annotations (where it makes sense) can be easily automated. We should include such an automation tool in Sage. It would complement the other development tools provided to assist with modularization tasks. That's:

Doctest # optional / # long / # needs annotation maintenance tool #35790

See also:

PDF documentation: Replace # optional by footnote symbols #35726

Part of:

Meta-ticket: Modularize sagelib into separate distributions (pip-installable packages) sagemath-... for Sage 10.x #29705

The text was updated successfully, but these errors were encountered:

kwankyu · 2023-06-14T07:10:45Z

How about this?

        Persistent optional tags work::

            sage: # long time
            sage: QQbar(I)^10000
            1
            sage: QQbar(I)^10000  # not tested
            I

            sage: # optional - sage.rings.finite_rings
            sage: GF(7)
            Finite Field of size 7
            sage: GF(10)  # not tested
            Traceback (most recent call last):
            ...
            ValueError: the order of a finite field must be a prime power

A persistent optional tag is triggered by a line sage: # ... (note empty code). Then it is effectively added at the end of each line of the block following the tag.

This seems better (and easy) than adding the tag to the start of a Sphinx directive ::, which I originally suggested.

It is easy since I could implement the persistent optional tag by less than 10 lines :-)

mkoeppe · 2023-06-15T16:08:13Z

I think it would be good if the syntax was explicit about affecting a block rather than a line.
For example:

            sage: # doctest block: long time
            sage: QQbar(I)^10000
            1
            sage: QQbar(I)^10000  # not tested
            I

            sage: # doctest block: optional - sage.rings.finite_rings
            sage: GF(7)
            Finite Field of size 7
            sage: GF(10)  # not tested
            Traceback (most recent call last):
            ...
            ValueError: the order of a finite field must be a prime power

And I'm not sure that a simple blank line should end a block in this sense. For other purposes, these blank lines have no syntactic meaning. So perhaps it should be

But overall these details don't matter much to me, as long as it can be kept in a format that makes writing automatic code maintenance scripts not too hard.

Probably you'll want feedback from people who participated in the discussion in https://groups.google.com/g/sage-devel/c/utA0N1it0Eo; @kcrisman @tscrim @tobiasdiez

tobiasdiez · 2023-06-15T16:22:58Z

xdoctest handles this as follows: https://xdoctest.readthedocs.io/en/latest/xdoctest.directive.html

# An inline directive appears on the same line as a command and

# only applies to the current line.

raise AssertionError('this will not be run (a)')  # xdoctest: +SKIP

print('This line will print: (A)')

print('This line will print: (B)')

# However, if a directive appears on its own line, then it applies

# too all subsequent lines.

# xdoctest: +SKIP()

raise AssertionError('this will not be run (b)')

print('This line will not print: (A)')

# Note, that SKIP is simply a state and can be disabled to allow

# the program to continue executing.

# xdoctest: -SKIP

print('This line will print: (C)')

print('This line will print: (D)')

# This applies to inline directives as well

# xdoctest: +SKIP("an assertion would occur")

raise AssertionError('this will not be run (c)')

print('This line will print: (E)')  # xdoctest: -SKIP

raise AssertionError('this will not be run (d)')

# xdoctest: -SKIP("a reason can be given as an argument")

print('This line will print: (F)')

mkoeppe · 2023-06-15T16:28:50Z

For cross-reference:

Support xdoctest's directives in the Sage doctester #35194

kwankyu · 2023-06-15T23:37:26Z

I think it would be good if the syntax was explicit about affecting a block rather than a line.

These optional tags already require the reader's understanding of the mechanism of optional tags. Then it is not a big deal that users learn about "persistent optional tags" in addition, which I think is quite natural and self-evident that it is a block tag (it is already a coding practice to add a comment above the relevant code block) . Once learnt, "doctest block:" would look just another cluttering.

But overall these details don't matter much to me, as long as it can be kept in a format that makes writing automatic code maintenance scripts not too hard.

The script would convert

sage: line1 # optional - sage.modules
sage: line2 # optional - sage.modules
<BLANKLINE>

to

<BLANKLINE>
sage: # optional - sage.modules
sage: line1
sage: line2
<BLANKLINE>

and

sage: line0
sage: line1 # optional - sage.modules
sage: line2 # optional - sage.modules
<BLANKLINE>

to

sage: line 0
<BLANKLINE>
sage: # optional - sage.modules
sage: line1
sage: line2
<BLANKLINE>

kwankyu · 2023-06-15T23:47:33Z

Given that "persistent optional tag" is very light weight (it takes 10 lines of code), xdoctest seems too much as a solution to our problem. Whether we need the full power of xdoctest in sage is another matter.

DaveWitteMorris · 2023-06-16T01:59:49Z

Matthias wrote on sage-devel:
Fine with me to introduce something like "# module - sage.groups" with identical semantics as "# optional - sage.groups". I'd suggest to take such discussions to #35750

It seems to me that this would be more self-explanatory if we use something like "# requires sage.groups" or "# requires import of sage.groups"

kwankyu · 2023-06-16T02:28:21Z

# optional - has a big merit that it has long been used for doctests depending on optional packages as in # optional - mathematica. Matthias just extended its use to distribution packages (# optional - sage.modules).

I don't think using just different phrases does not add much but lose the merit.

DaveWitteMorris · 2023-06-16T02:59:06Z

I agree that # optional has worked well for doctests, especially since packages are classified as "optional". But for examples in documentation that an end-user will see, I think something like # requires ... is much more helpful. It is saying that if you want to do the example yourself, then you need to havesage.modules. I think that almost anyone who sees requires <whatever> will realize that they need to figure out a way to get <whatever>. (Maybe a different phrase would be even better.) But I don't think that the tag optional <whatever> conveys the same message to a non-expert. Indeed, I don't think I would understand what it meant if I weren't already involved with sage development.

mkoeppe · 2023-06-16T03:40:40Z

How about # standard - sage.groups then

kwankyu · 2023-06-16T03:42:01Z

I think that almost anyone who sees requires <whatever> will realize that they need to figure out a way to get <whatever>.

Right. So that is more problematic in a sense because in reality they don't need to do anything to run the doctests since <whatever> is already in sage. Those tags are really for the (yet imaginary) users of selected pieces of sage.

kwankyu · 2023-06-16T03:48:37Z

So I think it is more important now that we hint (or educate) the user that they just can ignore the optional tags, for which "optional" seems better than "requires" (which scares a user :-)

DaveWitteMorris · 2023-06-16T04:09:33Z

You are right that the word "requires" is not helpful for users of the distribution (although I think it is good for the "(yet imaginary) users"). Maybe the word "uses" would be better? For a user of the distribution, this is just a comment about what is being used, but for a user who is importing a-la-carte, this is telling them what they need. There may well be an even better word.

Following up on Matthias's suggestion, it would be more informative to say something like # uses sage.groups (a standard module) and sage.mathematica (an optional module), but I would guess we all agree that is too unwieldy.

Perhaps part of our nonagreement is that I am only talking about examples, not all doctests. My comments are intended to be directed at what will show up in the (printed or html or ...) documentation, so I am not trying to say anything about code that appears in a TEST:: block. In my way of thinking, doctests are for developers, but I hope that examples in the documentation can be clear to novice users, with as little special education as possible. Perhaps I'm not being realistic.

kwankyu · 2023-06-16T04:14:44Z

We have now the following candidates:

(o) optional # optional - sage.groups (current)
(m) module # module - sage.groups
(s) standard # standard - sage.groups
(r) requires # requires sage.groups
(u) uses # uses sage.groups
(n) needs # needs sage.groups
(a) assumes # assumes sage.groups

I added the last one.

Anyway, this is orthogonal with the idea of persistent tags I suggested before.

mathzeta · 2023-06-16T09:39:39Z

If those annotations correspond to actual modules, then other candidates are # import sage.groups and # imports sage.groups. Then it is easy to see how they can be used on a line of their own, instead of codeblock-scoped.

kwankyu · 2023-06-16T10:31:59Z

No. It just signals that the distribution package providing the feature named sage.groups (which of course means group functionality) should be installed for the doctests to work.

kwankyu · 2023-06-16T10:54:45Z

+1 to (n). It reads well.

tobiasdiez · 2023-06-16T13:49:15Z

Would it work to explicitly import things from other distributions (not as an comment, but as a direct import), then analyze the imports and if they point to a different distribution then ignore the lines following the import?

This would also have the huge advantage to support users that want to use sage as a normal python package without loading the sage environment.

mkoeppe · 2023-06-16T15:03:45Z

Would it work to explicitly import things from other distributions (not as an comment, but as a direct import), then analyze the imports and if they point to a different distribution then ignore the lines following the import?

I think we would be able to generate import statements from the # optional annotations.

The necessary cross-referencing information is going to be in sage.features.sagemath.
See:

Add style guide / reference for # optional - sage.... doctest tags, extend sage -t and sage -fixdoctests for modularization tasks #35749 (needs review)
(doc preview: https://deploy-preview-35749--sagemath-tobias.netlify.app/reference/misc/sage/features/sagemath.html#sage.features.sagemath.all_features)

This would also have the huge advantage to support users that want to use sage as a normal python package without loading the sage environment.

+1.

I could imagine that the code blocks in the formatted HTML could offers tabs for different views, as is done for example in https://setuptools.pypa.io/en/latest/userguide/package_discovery.html: The example in monolithic Sage (goes through preparser, no imports), and the example in plain Python (some prettified version of the preparsed example + imports). This would also address @williamstein's comment (https://groups.google.com/g/sage-devel/c/utA0N1it0Eo/m/79aD4-RQAAAJ)

kwankyu · 2023-06-17T00:55:19Z

I chose "needs" as an alternative to "optional" in the linked PR. Let me know if you object.

"needs" is from https://groups.google.com/g/sage-devel/c/lb1Eq8o0a5I/m/dwofof9dAAAJ

kwankyu · 2023-06-17T07:38:16Z

One problem with "needs" (and other phrases as well in this matter except "optional") is that people may use the word in normal comments. One instance was revealed in the linked PR (there is a comment "# needs 2 period").

Instead of trying to maker our parser smarter, I think it is easy and safer to use more formal "needs:" (note the colon at the end) for our purpose. So # needs: sage.groups. Any idea?

kwankyu · 2023-06-17T13:13:03Z

I decided to go with "needs" (colon or hyphen can follow optionally though), revising the problematic doctests.

DaveWitteMorris · 2023-06-21T05:22:18Z

I think this is much better than "optional", so I do not object. However, "needs" is almost synonymous with "requires" in this context, so I'm surprised that you chose it. I think "uses" would be better, but I do not feel strongly and do not object.

kwankyu · 2023-06-21T15:22:52Z

I think this is much better than "optional", so I do not object.

Thanks.

However, "needs" is almost synonymous with "requires" in this context, so I'm surprised that you chose it.

People seemed to want something more direct than "optional" in that context, and "needs" sounds (to me yes) gentle and soft compared with other candidates.

I also realized, by Matthias' "standard" candidate, that "optional" in that context is not quite right since "sage.groups" is not as "optional" as say "mathematica".

mkoeppe · 2023-06-21T15:36:49Z

Yes, that's exactly why I suggested "standard" - which continues to have the same technical meaning in the modularized setting as in the Sage distribution

DaveWitteMorris · 2023-06-21T17:18:25Z

I think the word "standard" is short for something like "requires the following standard module" (or "assumes the following standard module has been imported"). My objection is that I don't think an uneducated user will understand this shorthand, and it seems to me that the word "needs" (or "requires" or "uses") is more self-explanatory -- it seems to me that it is better at getting across the main idea without requiring further explanation. (And I like the word "uses" in the distribution context, because it sounds more like a comment that can be ignored, whereas "needs" and "requires" would seem to indicate that something needs to be done.)

By the way, if "uses" (or a similar word) is chosen, then perhaps something like "uses module:" (or "uses modules:" when there is more than one) would be better if it is ok to have more than one word here. Specifically, it would pretty much eliminate the possibility of random comments that happen to have the form picked up by the doctester.

I may be wrong on all of this, but thought I should share my opinions in case they are useful. My main point is that, for the reason in my first paragraph, I think "needs"/"requires"/"uses" (or something similar), is better than "standard" and "optional" for the user documentation. Choosing between "needs"/"requires"/"uses" may be bikeshedding, so I really don't want to push on that.

soehms · 2023-06-21T21:47:21Z

And I like the word "uses" in the distribution context, because it sounds more like a comment that can be ignored, whereas "needs" and "requires" would seem to indicate that something needs to be done.

+1

By the way, if "uses" (or a similar word) is chosen, then perhaps something like "uses module:" (or "uses modules:" when there is more than one) would be better if it is ok to have more than one word here. Specifically, it would pretty much eliminate the possibility of random comments that happen to have the form picked up by the doctester.

I think checking that the words after the keyword are the names of features would be secure enough.

kwankyu · 2023-06-22T15:34:38Z

There are 5 doctest instances that have dangling comments starting with "# uses ..." while there was 1 doctest instance that has a dangling comment starting with "# needs ...", which was fixed by #35779. Hence "needs" is slightly safer than "uses" as a tag.

"# needs sage.groups" really means that the doctest needs the distribution package installed that provides the feature named "sage.groups". It is somewhat misleading to say that the doctest "uses" the module (python module or python package) "sage.groups".

soehms · 2023-06-22T17:03:55Z

There are 5 doctest instances that have dangling comments starting with "# uses ..." while there was 1 doctest instance that has a dangling comment starting with "# needs ...", which was fixed by #35779. Hence "needs" is slightly safer than "uses" as a tag.

"# needs sage.groups" really means that the doctest needs the distribution package installed that provides the feature named "sage.groups". It is somewhat misleading to say that the doctest "uses" the module (python module or python package) "sage.groups".

I agree from the point of view of a developer. But from the point of view of a user who sees this in the documentation, the other choice would be less irritating.

mkoeppe · 2023-06-22T17:30:31Z

How about in the block-scoped annotations, we spell it out as # uses standard feature - sage.groups, whereas in the line annotations we use the shortening # standard - sage.groups.

The maintenance script for rewriting annotations (wishlist item) can bring annotations to this standard form

The keyword recognized would just be standard, the rest would be ignored noise

kwankyu · 2023-06-22T20:39:34Z

... Choosing between "needs"/"requires"/"uses" may be bikeshedding, so I really don't want to push on that.

There is a reason why the term bikeshedding was invented. It is so common. In our case, the problem is not trivial as all developers and users would use and see "needs", "requires", "uses", etc.

Hence to settle down this problem rapidly without spending too much time for discussions and discussions, let's accept that we really care about "bikeshedding".

I would go straight for a sage-devel voting with the following 4 candidates:

(A) "needs"

sage: # needs sage.rings.number_field
sage: QQbar(I)^2
-1
sage: QQbar(I)^1000000000 # long time
1

sage: # needs sage.rings.finite_rings
sage: F = GF(7)
sage: F(1) + QQbar(1)  # needs sage.rings.number_field
...

(B) "requires"

sage: # requires sage.rings.number_field
sage: QQbar(I)^2
-1
sage: QQbar(I)^1000000000  # long time
1

sage: # requires sage.rings.finite_rings
sage: F = GF(7)
sage: F(1) + QQbar(1)  # requires sage.rings.number_field
...

(C) "uses"

sage: # uses sage.rings.number_field
sage: QQbar(I)^2
-1
sage: QQbar(I)^1000000000  # long time
1

sage: # uses sage.rings.finite_rings
sage: F = GF(7)
sage: F(1) + QQbar(1)  # uses sage.rings.number_field
...

(D) "standard"

sage: # use standard feature - sage.rings.number_field
sage: QQbar(I)^2
-1
sage: QQbar(I)^1000000000  # long time
1

sage: # use standard feature - sage.rings.finite_rings
sage: F = GF(7)
sage: F(1) + QQbar(1)  # standard - sage.rings.number_field
...

Do you agree?

kwankyu · 2023-06-22T20:51:15Z

Don't worry about minor details or punctuations. As Matthias said, the keyword would be recognized, the rest would be ignored noise.

But for the voting, let me know what to edit to make your favorite candidate look best.

DaveWitteMorris · 2023-06-22T21:11:03Z

Having a vote is fine with me, but I think it might be a good idea to have a comment period on sage-devel before calling the vote, because someone may want to suggest another candidate.

Is there a mistake in the second example of (D)? I don't think "uses" should appear after the hyphen.

I don't understand the # not tested examples. The answer is not correct, so what are they supposed to illustrate? I guess that will be part of the explanation of the vote.

mkoeppe · 2023-06-22T21:32:21Z

I agree, other examples for the # not tested annotation, for example very long running time, would better

williamstein · 2023-06-22T23:06:28Z

I like "needs" because "uses" and "requires" are very close to the keywords in Julia and Javascript for importing modules, whereas "needs" doesn't resemble the package import keyword for any programming language (according to a little search I just did). Thus using "needs" could avoid confusion.

kwankyu · 2023-07-07T22:44:43Z

Supported by the voting result

https://groups.google.com/g/sage-devel/c/MtS2u3VbJEo

we can go ahead with the keyword "needs". I thank you all who participated in the discussion.

mkoeppe added wishlist item c: doctest framework labels Jun 9, 2023

kwankyu mentioned this issue Jun 16, 2023

Implement persistent optional tag #35779

Closed

5 tasks

mkoeppe mentioned this issue Jun 17, 2023

Auto-translate doctests to pure Python, with import statements #35791

Closed

mkoeppe mentioned this issue Jun 16, 2023

Add style guide / reference for # optional - sage.... doctest tags, extend sage -t and sage -fixdoctests for modularization tasks #35749

Merged

5 tasks

vbraun closed this as completed Jul 16, 2023

mkoeppe added this to the sage-10.1 milestone Jul 16, 2023

Codeblock-scoped # optional annotations for doctests (modularization) #35750

Codeblock-scoped # optional annotations for doctests (modularization) #35750

Comments

mkoeppe commented Jun 9, 2023 • edited Loading

kwankyu commented Jun 14, 2023 • edited Loading

mkoeppe commented Jun 15, 2023

tobiasdiez commented Jun 15, 2023

mkoeppe commented Jun 15, 2023

kwankyu commented Jun 15, 2023

kwankyu commented Jun 15, 2023

DaveWitteMorris commented Jun 16, 2023

kwankyu commented Jun 16, 2023

DaveWitteMorris commented Jun 16, 2023

mkoeppe commented Jun 16, 2023

kwankyu commented Jun 16, 2023

kwankyu commented Jun 16, 2023 • edited Loading

DaveWitteMorris commented Jun 16, 2023

kwankyu commented Jun 16, 2023 • edited Loading

mathzeta commented Jun 16, 2023

kwankyu commented Jun 16, 2023

kwankyu commented Jun 16, 2023

tobiasdiez commented Jun 16, 2023

mkoeppe commented Jun 16, 2023

kwankyu commented Jun 17, 2023 • edited Loading

kwankyu commented Jun 17, 2023

kwankyu commented Jun 17, 2023 • edited by mkoeppe Loading

DaveWitteMorris commented Jun 21, 2023

kwankyu commented Jun 21, 2023

mkoeppe commented Jun 21, 2023

DaveWitteMorris commented Jun 21, 2023

soehms commented Jun 21, 2023

kwankyu commented Jun 22, 2023

soehms commented Jun 22, 2023

mkoeppe commented Jun 22, 2023 • edited Loading

kwankyu commented Jun 22, 2023 • edited Loading

kwankyu commented Jun 22, 2023 • edited Loading

DaveWitteMorris commented Jun 22, 2023

mkoeppe commented Jun 22, 2023

williamstein commented Jun 22, 2023

kwankyu commented Jul 7, 2023

Codeblock-scoped `# optional` annotations for doctests (modularization) #35750

Codeblock-scoped `# optional` annotations for doctests (modularization) #35750

mkoeppe commented Jun 9, 2023 •

edited

Loading

kwankyu commented Jun 14, 2023 •

edited

Loading

kwankyu commented Jun 16, 2023 •

edited

Loading

kwankyu commented Jun 16, 2023 •

edited

Loading

kwankyu commented Jun 17, 2023 •

edited

Loading

kwankyu commented Jun 17, 2023 •

edited by mkoeppe

Loading

mkoeppe commented Jun 22, 2023 •

edited

Loading

kwankyu commented Jun 22, 2023 •

edited

Loading

kwankyu commented Jun 22, 2023 •

edited

Loading