Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support &str in the printer algebra #611

Closed
lpil opened this issue May 26, 2020 · 18 comments
Closed

Support &str in the printer algebra #611

lpil opened this issue May 26, 2020 · 18 comments
Labels
help wanted Contributions encouraged

Comments

@lpil
Copy link
Member

lpil commented May 26, 2020

Currently the Document enum structure can only contain Strings, so any time we have a &str we need to allocate a String and copy the &str into it.

Add a new enum variant for Document which can hold a &str, allowing us to remove lots of cloning from the Erlang printer and Gleam formatter.

This will require adding an explicit lifetime to the Document enum.

@lpil lpil added the help wanted Contributions encouraged label May 26, 2020
@scrabsha
Copy link
Contributor

scrabsha commented Jun 2, 2020

Have you considered using Cow instead of String instead?

@lpil
Copy link
Member Author

lpil commented Jun 2, 2020

I had not! That could be a better solution :)

@scrabsha
Copy link
Contributor

scrabsha commented Jun 2, 2020

Nice! Can I start working on it?

@lpil
Copy link
Member Author

lpil commented Jun 2, 2020

That would be great! Thank you!

@scrabsha
Copy link
Contributor

scrabsha commented Jun 3, 2020

After some work, here is a report of my progress so far:

I have uploaded my code on my fork, on this precise commit.

The codebase still have 5 borrowing errors here and there. I am not sure about how to fix them.

I'll continue to work on that, but I may need some help.

@lpil
Copy link
Member Author

lpil commented Jun 3, 2020

Thanks for the patch. I checked out your commit but was unable to determine how to resolve the ownership problems while using Cow. I'm afraid I am far from an expert when it comes to Rust and I've not used Cow before. :(

@scrabsha
Copy link
Contributor

scrabsha commented Jun 3, 2020

Don't worry, we'll make it work :)

@scrabsha
Copy link
Contributor

scrabsha commented Jun 5, 2020

I spent a lot of time today fixing these errors. The solution was to add more and more Cow. This led to many other errors, which themselves required more Cow, and so on... The code is not entirely pushed yet, I'll push it when cargo check will report no error.

Also, I notice that the final PR will be very huge. I am currently at +644, -313, and expect it to grow, maybe to +700. As such, I think it would also require a lot of time to review.

Finally, once the PR is accepted, I'll try to create a derive macro on a separate crate which will make Cow handling a bit easier. Hopefully, this would allow us to remove some boilerplate introduced by this PR. Of course, it will have a pun-ish name :)

@scrabsha
Copy link
Contributor

scrabsha commented Jun 7, 2020

So this if compiling, and tests pass. There as still a lot of clones to remove, but it will be way easier now.

In order to reduce the amount of merge conflicts and the size of the PR, I'd like to split them. Are you okay with it?

@lpil
Copy link
Member Author

lpil commented Jun 7, 2020

Sounds great! :)

@scrabsha
Copy link
Contributor

Hi,

Disclaimer: I use a very formal tone here. While I don't like this way to communicate, I need to ensure that I'll be understood. English is not my native language, sorry about that.

As I said previously, I am pretty sure I still don't understand what is exactly expected here. So I'm back with some questions.

  1. Exact lifetime for Document:
    The most general way to fix this issue would be to add a generic lifetime to Document (say, 'a), and to add, as the issue description suggest, a BorrowedText variant, containing an &'a str. However, it would be easier to support static str in a first approach. This would allow us to make for instance "foo".to_doc() require no allocation and no copy. Unfortunately, this would not fix the issue since it has no impact on non-static str, but is already a good start.
    In order to make pull requests easy to review, I'd like to first do the "static str approach", then submit a second PR, which introduce a generic lifetime, and would close this issue. Are you ok with it?
    In my opinion, the current Text variant should be renamed OwnedText. I think it makes sense to show an explicitly-owned and another explicitly-borrowed variant. What do you think of it?

  2. About the Document::Break variant:
    The Break variant of Document contains two String (the broken and the unbroken fields). Should I also create a BorrowedBreak variant, with the same properties as BorrowedText?

  3. Exact meaning of "algebra":
    Until then, I was convinced that what is called "the printer algebra" refers to the printer algebraic data type, which is in fact the Document enum. Is this correct?

Have a nice day :)
Sasha

@lpil
Copy link
Member Author

lpil commented Oct 24, 2020

Hi @scileo! I wanted to appologise for not getting back to you sooner on this. I thought I had replied but I must have made a mistake, I didn't intend to leave you waiting for so long!

The most general way to fix this issue would be to add a generic lifetime to Document (say, 'a), and to add, as the issue description suggest, a BorrowedText variant, containing an &'a str. However, it would be easier to support static str in a first approach. This would allow us to make for instance "foo".to_doc() require no allocation and no copy. Unfortunately, this would not fix the issue since it has no impact on non-static str, but is already a good start.
In order to make pull requests easy to review, I'd like to first do the "static str approach", then submit a second PR, which introduce a generic lifetime, and would close this issue. Are you ok with it?

This sounds good, though I don't quite understand why a static lifetime would be needed here? Could we start with a parameterised lifetime? Document<'a>?

In my opinion, the current Text variant should be renamed OwnedText. I think it makes sense to show an explicitly-owned and another explicitly-borrowed variant. What do you think of it?

Sounds good. Alternatively String and Str, to match the Rust String and str types.

About the Document::Break variant:
The Break variant of Document contains two String (the broken and the unbroken fields). Should I also create a BorrowedBreak variant, with the same properties as BorrowedText?

I was wondering if these could actually just hold a str instead of a string. I've not checked all uses of it yet but I think we don't have any dynamic strings there.

Until then, I was convinced that what is called "the printer algebra" refers to the printer algebraic data type, which is in fact the Document enum. Is this correct?

Yes that's right. The language is taken from the "A Prettier Printer" and "Strictly Pretty" papers that this module is based on. Perhaps they're a little unclear, but I think it's useful to stay close to the reference material.

Thank you again, let me know if you'd like to continue with this work after my long delay replying!

@lpil lpil added this to the v0.13.0 milestone Jan 2, 2021
@lpil
Copy link
Member Author

lpil commented Jan 6, 2021

I had a go at this this week and I found it very difficult to get the code to type check after adding the lifetime for a &str. This'll need Rust skills better than mine !

@lpil lpil removed this from the v0.13.0 milestone Jan 6, 2021
@scrabsha
Copy link
Contributor

scrabsha commented Jan 9, 2021

Hi!
First, I'd like to apologize for not answering here for so long. Some life events dramatically reduced the amount of time I can invest in open source software. I know this is not correct, but had no other choice.

I am still wiling to work on this. The problem is that I don't know when I will pr something that will close it. As such, I think the best thing to do is to leave me unassigned, and hopefully I'll come back later with a patch that would close the issue.

Also: trying to fix this greatly improved my understanding of lifetimes, but gave me a lot of headaches :)

@lpil
Copy link
Member Author

lpil commented Jan 10, 2021

Hey! Please do not appologise, I am grateful for your help and I have no expectation that you do any more, especially if you have more important things to attend to.

After reviewing you last pull request again I think I might have a better idea of how to add the lifetime so I may have another go in future. Thank you!

lpil added a commit that referenced this issue Jan 10, 2021
@lpil
Copy link
Member Author

lpil commented Jan 10, 2021

I figured it out using your work as a guide. Thanks @scrabsha !

@lpil lpil closed this as completed Jan 10, 2021
lpil added a commit that referenced this issue Jan 10, 2021
@scrabsha
Copy link
Contributor

@lpil I quickly reviewed your work, and it sounds awesome to me!

Do you have any idea about why my implementation was so huge and unbearable while yours feels way lighter? Is this solely caused by the use of Str and String variant in Document instead of my Cow<str>?

I'm not trying to trick you, I just want to know what trick allowed you to make it so simple :)

@lpil
Copy link
Member Author

lpil commented Jan 18, 2021

I'm not really sure! My last attempts were noisier than your Cow one but this latest version worked out OK. I'm mostly just glad we figured it out in the end, took me quite a few attempts 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Contributions encouraged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants