-
Notifications
You must be signed in to change notification settings - Fork 101
Octavius 2 #678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Octavius 2 #678
Conversation
It's tricky to get the cram tests to play nicely with the two packages with dune 2.7, but upgrading to dune 2.8 will currently break esy. As a compromise the octavius opam file doesn't run the tests, they're all run in the odoc opam file.
bef11e7
to
e0d7094
Compare
While the opam package has been split into octavius and odoc, the esy build remains one. Package build and install steps have been amended for this.
e0d7094
to
0312a90
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the parser stays in this repository, I think it should be called odoc-parser
so it's easier to relate packages in Opam and to guess the library name.
{| | ||
((output | ||
(((f.ml (1 0) (1 46)) | ||
(code_block "ocaml env=f1 version>=4.06 " "code goes here")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove the trailing whitespaces ?
{ emit_code_block input None c } | ||
|
||
| "{@" (code_block_meta as m) "[" (code_block_text as c) "]}" | ||
{ emit_code_block input (Some m) c } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will Odoc use this information in the future ? For example, the first word for syntax highlighting. I think we should parse the "language" field separately then.
let raw_markup_target = | ||
([^ ':' '%'] | '%'+ [^ ':' '%' '}'])* '%'* | ||
let code_block_meta = | ||
([^ '['])* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we allow quotes in case someone wants brackets here ? I propose
{{@ocaml foo=["bar"]}[ ... ]}
, the meta part isocaml foo=["bar"]
and it should allow balanced curly braces.
|
||
let unterminated_code_block_with_meta = | ||
test "{@met"; | ||
[%expect] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test are not running (disabled by a (enabled_if false)
). I've been able to fix the tests: Julow@061b091
Is here a reason to keep them disabled ?
type nestable_block_element = | ||
[ `Paragraph of inline_element with_location list | ||
| `Code_block of string | ||
| `Code_block of string option * string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that there is two fields, I think both of them should have a with_location
.
are never rendered in the output *) | ||
|
||
type external_tag = | ||
type ocamldoc_tag = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this reference to ocamldoc
isn't necessary, we can call this tag
instead.
src/parser/warning.mli
Outdated
@@ -0,0 +1,8 @@ | |||
type t = { location : Location.span; message : string } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this file has been moved with changes loses its git history and easy access to its authorship information.
"Anton Bachin <antonbachin@yahoo.com>" | ||
"Leo White <leo@lpw25.net>" | ||
"Jules Aguillon <juloo.dsi@gmail.com>" | ||
"Jon Ludlam <jon@recoil.org>"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonludlam Per our Discord discussion, can you justify this authorship metadata? I could be wrong about this, but I will detail what I see:
-
While Leo has given important feedback that affected and advanced the development of the parser, looking at the commit history of the entire
parser
directory in the branch this PR is from, I don't see any commits from Leo, nor is the parser in any other way based on any code authored by Leo. Both the code and design are my original work (I don't claim that the overall approach is in any way original with respect to the "state of the art" of parsers in general, just this particular parser is not originally based on the work of anyone else listed here except me). -
The next part is more debatable. Looking at the same commit history, I see a large number of important contributions from you and Jules. However, though I did not check absolutely every commit, and skipped many commits that, from their message, seemed to touch the parser only incidentally, I doubt that these contributions amount to authorship. They appear to be maintenance, which is what the
maintainer
field is for. Of course, this is a judgment call, but I saw many commits that added features locally, or were refactorings. While contributing important pieces of original code, these don't amount to authorship at large, in my opinion. For comparison, to illustrate (still just) my opinion in operation, even though I maintained Lwt for four years, and added many features, did a lot of clean-up, rewrotelwt.ml
,lwt.mli
, and several other files completely, and wrote almost all the current tests, I am still not an author of Lwt:authors: [ "Jérôme Vouillon" "Jérémie Dimino" ] maintainer: [ "Raphaël Proust <code@bnwr.net>" "Anton Bachin <antonbachin@yahoo.com>" ]
I am a contributor and a former maintainer. I did not set the original structure of Lwt, nor write any major new modules (though I assigned authorship credit for sublibraries to some people that did). I didn't perform any fundamental rework that could have consituted authorship on the scale of the library (as you have done with odoc's new model, in my understanding).
I don't see that any such thing has happened (yet?) with odoc's parser.
I'm less concerned about the second point than the first, however, which I truly don't understand. My first worry is that it is simply a holdover field from Octavius v1. If that's so, I think it's clear that we can't assign authorship on the basis of renaming a library so that it shares a name with another one :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From an opam-repository perspective, we don't take a strong position on how projects ascribe metadata such as authorship vs maintainers. Thinking about it more now, we could take advantage of the multi-versioning packaging to achieve two goals:
- precise author information for each published version of the package (as suggested by @aantron above), so that the version-specific page on the docs site reflects who the primary author(s) of the package in question are.
- aggregated author information from all versions of the opam package on the main package homepage, which shows the full historical set of authors associated with that package name. This would be attribute those who were associated with the project in the past, but do not have current code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be attribute those who were associated with the project in the past, but do not have current code.
In this particular case, however, the previous authors are not associated with v2 at all (as authors). It's just a completely separate project. It's not even a question of current code — it's not that their code was deleted inside the project, and then replaced by new code. The current parser, as a parser, is a completely separate, fresh project, that is now being renamed to match the name of another project. How are the previous authors associated with the present project in the past? They are associated with a separate, past project, which, confusingly, has the same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current parser, compared to the original Octavius, is a fresh design, fresh code, fresh metadata, fresh tests, etc., strarted completely from scratch, with no inspiration from Octavius, and, indeed, in a sort of "opposition" (in design terms) to it. The only relation it has to the original Octavius is that it was meant to fill the same role within the larger odoc project, but as a parser, it is completely new and separate. There is no way in which the authors of the past Octavius can automatically be considered as past authors of the "current" "Octavius," except, of course, unless they actually did contribute as authors to the new one (which they did not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- aggregated author information from all versions of the opam package on the main package homepage, which shows the full historical set of authors associated with that package name.
This might make sense for almost all packages, but, for the reasons above, it does not make sense if we will practice dropping in one project into the name slot of another, as is happening in this PR. The package page, especially if it tends to display the latest version's other metadata, will be misleading if it shows authorship metadata of earlier, completely unrelated versions of the supposedly "same" package, mixed in with current authorship metadata.
The per-version metadata should still display the authors of that version, however, I do agree with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have no way of distinguishing the subtlety of a rewrite in opam
Sorry to spam, but re-reading this... Octavius v2 is not a "rewrite" of Octavius v1. It's just a different project. It's not API-compatible, similar in design, etc. There is no subtlety about it. I understand that opam can't express that it's a new project, but, as already mentioned by me and several people, that seems like a good reason not to shadow Octavius v1 if you're not willing to also be accurate in the project's metadata about the actual origin of the so-called "Octavius v2" (a name I never had in mind).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To give some detail on the API incompatibility, it's not just some minor breaking change, but a completely new API with new data structures (though there is inevitably some similarity due to them representing the same thing — on the other hand, there are fundamental differences like the representation of spacing, EDIT: and the clear separation of flow and phrasing content). When I integrated the new parser into odoc, I replaced the comment markup generator, and made invasive changes throughout odoc. So it's not a matter of some "unfortunate minor breakage" and maybe a few internal cleanups causing v1 to become v2. It's literally a different unrelated project with different concepts deliberately chosen even at the type level.
EDIT: so how can this be called a "rewrite?" I wasn't "rewriting" Octavius v1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your objection is noted, and your work won't be renamed to octavius v2. We'll have to figure out then how to ensure that octavius v1 ceases to be used by anyone in a future odoc developer meeting (which you are, of course, most welcome to join if you want).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I don't have a specific goal of this not being called octavius
, by the way. Just if it's in combination with the above.
As another thought, it just occurred to me that perhaps nobody working here on this issue has looked in detail at the code of Octavius v1 (could be wrong), and so is immediately aware of how substantially different the two parsers are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm entirely aware of the history of both octavius v1 and this parser, having been around since day 0 of this project :-) My example of dns
was deliberate: it is a total rewrite in exactly the same style as the odoc parsers being discussed here; see mirage/ocaml-dns#159 for more (no common provenances between the codebases).
The point of keeping the same name from an opam perspective is that both the old and new libraries are fulfilling the same purpose. A user who comes along and was using octavius v1 would note there is a new version with a major bump, and follow a migration guide, and eventually be on v2 (with its new interface). The goal is to hide the internal details of how it came to be from users and make a statement that "the successor to octavius v1 is v2". opam has upper bounds available for older consumers of the library.
However, this only really works if there's consensus among the developers of the old and the new. In this case, there isn't (which is just fine), and the fastest way to move on is just to provide clarity to our users about which ocamldoc parser they are supposed to be using as part of the release of the new one. For instance, a quick search reveals Imandra uses Octavius currently.
Rebrand the odoc parser as 'octavius'.
Also tidy the interface quite a lot.