Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax highlighted output #4

Closed
Enselic opened this issue Feb 15, 2022 · 10 comments
Closed

Syntax highlighted output #4

Enselic opened this issue Feb 15, 2022 · 10 comments
Labels
enhancement New feature or request

Comments

@Enselic
Copy link
Member

Enselic commented Feb 15, 2022

It would be nice to syntax highlight the output. Should be pretty easy.

  1. Create a custom sublime syntax for the output format
  2. Use the bat library to highlight the output
@Enselic Enselic added the enhancement New feature or request label Feb 15, 2022
@Enselic Enselic changed the title Syntax highlight output Syntax highlighted output Feb 15, 2022
@douweschulte
Copy link
Collaborator

douweschulte commented Mar 30, 2022

image
I do agree it looks nicer that way. Here I just bodged my way until it worked. I used the syntect crate to supply the highlighting rules. If you want to see how I did it, see my fork: https://github.com/douweschulte/public_items.

I would like to add that for me highlighted differences in the changes list would be really nice, maybe even more than syntax highlighting. Something like the removed line the changed text in red and on the added line the text in green.

@Enselic
Copy link
Member Author

Enselic commented Mar 30, 2022

Super big thanks for prototyping that! That indeed looks much better. I looked at how you implemented it, and have the following comments. I understand you just did a quick and dirty proof of concept, but I would like to give some feedback anyway :)

  • I would prefer to have the highlighting performed in the convenient cargo public-items wrapper (i.e. the project this issue is in) rather in the core public_items library, that I would like to be focused on raw processing of public items.
  • For robustness I think we need to write a custom syntax definition rather than using the real Rust syntax.
  • The exact textural representation of public items is still subject to change and tweaks. In particular, fixing https://github.com/Enselic/public_items/issues/16 will probably change it a bit. But some aspects are of course extremely unlike to change.

I agree a more user-friendly changed-item diff would be great to have. I filed #11 for it.

@douweschulte
Copy link
Collaborator

Thanks for the feedback points. But how would a custom syntax be more robust than the real Rust syntax? I would think that catching all edge cases etc would be pretty hard. I would guess that using some library would be less work, especially in the light of syntax changes/additions in upcoming versions of Rust.

I would agree that pure highlighting could just as well or beter live in the wrapper. But maybe it would be nice for other uses of the PublicItems to know its composition. I was thinking about something like syn, but maybe that is a bit OP. But having the public items fully tokenized could give rise to extra uses. Like having the comparison for Enselic/public-api#11 done on a token level instead of text level, which would make it more robust. This could also potentially make it possible to find out if a change is really breaking (the function accepts less things) or something else. But that is maybe not your vision for this project.

I would be happy to help. I am quite interested in where this project could go.

@Enselic
Copy link
Member Author

Enselic commented Mar 30, 2022

Thanks for the feedback points. But how would a custom syntax be more robust than the real Rust syntax?

I base that mostly on that the following example does not highlight properly in practice:
Skärmavbild 2022-03-30 kl  21 27 32

But it also is not robust in theory, since my tool (at least currently) uses non-standard syntax constructs such as pub struct field some::Struct::field. And there are no semicolons to end "statements", to give another example. So I don't really see how it can be robust to use the regular Rust syntax definition. On the other hand I am very open to being proved wrong. But if we end up needing a custom syntax definition, I think it is a bit too early to write it, because the exact format of the output is still in flux.

I was thinking about something like syn

That is a very interesting idea that didn't cross my mind! I agree that has a lot of potential to be a powerful approach. I have been struggling with finding an internal representation of PublicItem that feels right. Representing a PublicItem with syn is perhaps the answer. If you could play around a bit with that I would be very grateful. I will try to avoid making big changes to the code base for a while so you don't get nasty conflicts. But do a final git fetch before you start because I just merged a PR :)

Edit: One more PR: Enselic/public-api#51

I would be happy to help. I am quite interested in where this project could go.

I couldn't be happier if you want to help out :) Feel free to ping me about anything at any time.

@douweschulte
Copy link
Collaborator

image

Okay I got a bit carried away. I created a new file tokens.rs with a definition for the tokens/structure as useful for this case. Essentially it is a compressed form of the rustdoc_types tokens with even some of them still in use. This is then 'leaked' in a bad way to the cargo crate. There the tokens are used to colour the output, for now I just did something with the colours. But I do like the 'automatic' theming by limiting the colours to the standard ANSI colours. For convenience I added a dependency to ansi_term.

So no extra dependencies are needed for the tokenization and the colouring is as robust as it gets. Some details have to be filled in I tried to leave TODO: everywhere I knew needs some polish. But overall it should never crash, in the worst case it will fall back to the previous output.

I will next try an implementation of Enselic/public-api#11 with this new token setup.

Note: I got in a fight with git, sorry for the PR, and the history in my fork may be a bit mangled.

@Enselic
Copy link
Member Author

Enselic commented Mar 31, 2022

Awesome. I will take a deeper look at the code as soon as I get time. Feel free to create PRs that we can use for casual discussion and early feedback. Until such PRs exists (if you are OK with creating them of course), here are some casual comments and questions on the code so far:

  • Instead of lazy_static and lazycell, consider using once_cell which is better in many ways. FWIW, here is an example PR that replaces lazy_static and lazycell with once_cell.
  • I can tell you keep print_with_headers() around in public_items. Now that you moved that code to cargo-public-items it is probably not needed in public_items anymore? If you really want it back we can certainly discuss it, but seems like it is not needed any longer. Let me know if you want some tips on how to conveniently work with those different git repos. I can see you found the path = /Users/yourself trick at least. I have considered putting them both in the same repo, but at the same time I like things to be simple, and having them in different repos creates a nice isolation of concerns in a way.
  • I can see that #![deny(missing_docs)] bothered you, so I will move that check out to CI. I think it is too annoying to keep around "hardcoded" like that, especially when prototyping.

Overall, your changes in both public_items and cargo-public-items look very promising to me. Would you be interested in becoming co-maintainer of both public_items and cargo-public-items, on both GitHub and crates.io? There would be absolutely no obligations, so you are free to not do anything at all. But I want my project(s) to succeed, and am not entirely comfortable with me being the only one with maintainer rights. We can also consider setting up an GitHub org to not have to work against my personal git repos. I would love to get your thoughts on this. Do you think moving the repos to a new GitHub org would be a good move? In general, having the repos in a GitHub org feels more "professional", I think. And the repos might seem less like random (potentially abandoned) projects on first glance by others.

@douweschulte
Copy link
Collaborator

It is all very early stages, I figured I have to change the token representation quite a bit again for Enselic/public-api#11. But I will definitively open a PR as soon as possible. Regarding your feedback, I am not using lazy_static anymore. And I will remove the print_with_headers() from public_items. For now the two repos seems to work quite nice, especially now that I have gotten a bit of an idea how everything is organised, but if you have a trick to automatically rebuild the cargo source when the library has changed I am all ear. The #[deny(missing_docs)] (and the additional #[deny(clippy::missing_docs_private_items)]) should be enabled, I will enable them before merging any PR, but for now while the API is unstable I just won't bother.

I do get the feeling of not wanting to be the sole account having the credit to change code, I have that same feeling on my crate. So if you feel like it fits I would be honoured to get maintainer access, but in the general case I would advise to wait until after a first successful PR. For my own crate I am still using my own name, I do not think moving it to an organisation is worth it until you have a set of crates with multiple maintainers and you need to gracefully handle access and such. For looking abandoned it is most important to keep working on it I guess, and creating commits/issues/prs/releases. That is more important than having an organisation or not.

@Enselic
Copy link
Member Author

Enselic commented Mar 31, 2022

if you have a trick to automatically rebuild the cargo source when the library has changed I am all ear.

I think I don't quite understand your question, because when you change to path = ... for public_items in Cargo.toml then public_items should be rebuilt whenever there is something to rebuild.

In any case, I just added some development tips here: https://github.com/Enselic/cargo-public-items/blob/main/doc/development.md.

Thanks a lot for sharing your thoughts on creating an org. We can wait with that. And I am also fine with waiting to give you admin rights until you have landed your first PR. And I want to stress that I think it is important for open source work to be voluntary and without stress, so please don't feel like you need to hurry create a (production ready) PR. But I am very much looking forward to it :) Let me know when you feel ready for an early review of any draft PR you might have.

@douweschulte
Copy link
Collaborator

Those tips look nice, very handy for new collaborators. I do agree with your view on open source work, and I am really happy that you explicitly state that. I will work a bit more on the draft PR I made now, when I think its ready for a review I will tag you in there.

@Enselic
Copy link
Member Author

Enselic commented Apr 20, 2022

Closed by Enselic/public-api#23. Thank you so much @douweschulte 🙏

@Enselic Enselic closed this as completed Apr 20, 2022
Enselic added a commit that referenced this issue May 22, 2022
Loop instead of recursion to get fully qualified name
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants