Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Flate, Gzip, Zip, Zlib to Compress #8886

Merged
merged 6 commits into from
Apr 20, 2020

Conversation

bcardiff
Copy link
Member

@bcardiff bcardiff commented Mar 6, 2020

Waiting on #8881
Flate is renamed as Compress::Deflate

There are manually deprecation notices that will show if the current files are required, but due to alias everything can keep working and offer a smooth transition.

The alias is marked as deprecated for docs. The @[Deprecated] is not supported on alias declaration / docs.

@asterite
Copy link
Member

asterite commented Mar 6, 2020

I'm not sure about this. These are pretty well-known names. If we do this, should we also put JSON, YAML, CSV and XML under a Serialization module? I don't think these names would ever conflict with user code exactly because they are well-known.

@asterite
Copy link
Member

asterite commented Mar 6, 2020

(on the other hand it's convenient because they will be grouped under a namespace so you could see what the compress options are available in the std, but that's all... at the price of always typing Compress in usages)

@bcardiff
Copy link
Member Author

bcardiff commented Mar 7, 2020

I agree with @asterite . Typing Compress:: all the time is a bit inconvenient.

The discovery of related modules can be handled by docs by other means.

The names are based on formats and RFCs. If a shard was to provide a drop in replacement it is possible to do so (since shards hide std-libs)

And there is little reason for conflicts to be generated.

@straight-shoota and @RX14 where the most interested in this change.

Maybe we can only rename ::Flate to ::Deflate.

@Indribell
Copy link

I agree with @asterite . Typing Compress:: all the time is a bit inconvenient.

How many times do you really write compress routines? Maybe one or two times in a single module. It really is not that much more work, for in exchange a more cleaner library...

Compress::??? Shows all compression and uncompression. If anything new gets added to the library in the future and people do not read the docs, they will instantly spot any new compression modules, when their IDE picks it up. The same thing will not so easily happen if everything is mixed together.

should we also put JSON, YAML, CSV and XML under a Serialization module?

Might also a good change. Do not forget that not everybody is a experienced programmer. For some people formats like Yaml may be new. They may know Json and Xml but go ??? on CSV/YAML... By putting it under the same category, it already pre-informs people what those are.

Better then having them as lose entities all over the library... It makes the standard library look more clean, in exchange for maybe a few dozen extra words ( what is 2 or 3 clicks with any half descent IDE ) in a large project.

@bcardiff
Copy link
Member Author

bcardiff commented Mar 9, 2020

Rebased on master since #8881 was merged.

I'm still ambivalent on this change. They are all formats and not libraries. I see no need to protect against possible changes of underlying libraries. And there are good names with little chance to clash imo.

@crystal-lang/crystallers ?

@asterite
Copy link
Member

asterite commented Mar 9, 2020

I agree with @Indribell that being able to group these type in the docs is good. But for that I would introduce something like tags or doc sections. I see no reason to introduce the scope in code.

@jhass
Copy link
Member

jhass commented Mar 9, 2020

I still kind of like this, but no hard feelings.

@j8r
Copy link
Contributor

j8r commented Mar 9, 2020

I am not sure what problem this PR solves?
AFAIK no one complained (yet).
It feels like just a matter of taste, one could revert this change because (s)he prefers top-level objects.

It would be OK if this new module brings new things, like methods - but that's not the case here.

@straight-shoota
Copy link
Member

The benefit is to better structure the stdlib components by grouping related concepts in a module namespace.

This change by itself is obviously neither strictly necessary nor does it bring strong immediate benefits that couldn't be reached by other means (such as a grouping feature in the API docs). It still helps uncluttering the top-level namespace.
But looking at the bigger picture there's a strong interest in a universal appearance of how individual stdlib features are organised.

Random, Crypto and Digest namespaces are already existing examples of such a structured organization. Compress just applies the same rationale on compression-related libraries.

Going the other way around for reaching uniformity is certainly a much weaker move because it clutters the top level namespace and introduces even more breaking changes.

So, while I understand and share some of the resentments against this change, I thin it is still a net win.

As noted above, we should also consider similar adaptations to other areas but they need to be discussed individually. Regarding Serializable for example, I'd suggest to hold off for exploring a potential implementation of a generic serialization library.

@j8r
Copy link
Contributor

j8r commented Mar 9, 2020

That's a fair point @straight-shoota , I wasn't aware of the bigger picture. It would be great to have an issue explaining this rationale, and have #8881 and this one implementing it, with a mention of Serializable.

@straight-shoota
Copy link
Member

Yeah sorry, this process originated in a private discussion among core team members and the context has been lost for anyone not involved in that. There should've been better communication about it.

@RX14
Copy link
Contributor

RX14 commented Mar 11, 2020

If we do this, should we also put JSON, YAML, CSV and XML under a Serialization module

These are frequently used formats and modules which people reach for on a daily, even hourly basis. They deserve a top-level name.

How often do you reach for the DEFLATE encoding? To me, the reasoning is purely popularity.

src/flate.cr Outdated Show resolved Hide resolved
@RX14
Copy link
Contributor

RX14 commented Mar 30, 2020

There's been some time to think and for everyone to re-read the thread and the reasoning for and against, so I think it's time for a vote.

👍 to move these infrequently used formats into a namespace
👎 to leave it as-is

@Indribell
Copy link

@RX14

I understand that you want to put a small vote up. But when you exclude one of the main issues, your kind of rigging the vote into what you favor.

The vote needs to be:

to move all formats into a namespace
to move infrequently used formats into a namespace
to leave it as-is

Moving infrequently used formats into a namespace, simply makes things more confusing because you have formats on the top and formats in namespaces. While it cleans up the namespace somewhat, who decides what is popular and what not?

Your running into a slippery road for future discussions:

Is Yaml popular? In my opinion not because i never need it but it may be super popular.
How about CSV ... who uses CSV anymore? But maybe its more often used then we realize.
...

I give you the lesson of PHP. If you mix and match now, you run into the issue later that you can not move formats in or out of a namespace ( things change in our IT world more frequently then we wish ). X format is everybody their popular option today. But what 5 years from now. 10 years... Maybe Y, Z, A, B also become very popular in 5 years time and before you know it, the top level namespace gets populated again because your scared to remove the older stuff but need to keep putting new stuff in.

When Crystal is 1.0 and more code is actively developed upon by end users, the issue of backward comparability will also show its ugly head. PHP made those mistakes in the past and people still hound the language to this day for "strlen" vs "str_repeat" type of shenanigans.

Half measures never result into good user experiences.

@didactic-drunk
Copy link
Contributor

The vote needs to be:

No it doesn't. Classes can easily move between namespaces a few at a time or not at all. A polluted root namespace wastes more time for everyone reading documentation than it saves for the few that type Compress::.

Having grouped documentation allows new programmers to read the docs seeing everything grouped by function at the same time or skip sections they don't want to see. Deflate, Gzip, Zlib, puts compression classes all over the place in an alphabetical listing. Not to mention Brotli, Snappy, Zstd, etc.

When someone makes a collective shard documentation server this namespace will provide the automatic grouping that many people look for and expect.

PHP made those mistakes in the past and people still hound the language to this day for "strlen" vs "str_repeat"

That's not a fair comparison. Moving a few classes to a new namespace is much easier to handle than changing an entire API.

@RX14
Copy link
Contributor

RX14 commented Apr 14, 2020

Well the result seems to be an even split among the core team, with more community members coming down on the side of moving it. Can we agree to moving these modules then?

I think the "slippery slope" argument - or taking the "move" argument to extremes - is a bit daft when we're the ones in charge of any changes.

@bcardiff
Copy link
Member Author

Closing in favor of #8952

Since this compression modules are sound to be user directly, then having them as top level is more comfortable that using a longer namespace.

The criteria discussed along the thread of time sensitive popularity also weights in for this choice.

Improving the discovery of them as "related types" can be handled in docs and/or guides.

There was also not enough consensus in #8886 (comment)

@bcardiff bcardiff closed this Apr 14, 2020
@didactic-drunk
Copy link
Contributor

didactic-drunk commented Apr 14, 2020

Since this compression modules are sound to be user directly, then having them as top level is more comfortable that using a longer namespace.

Having them at the top level makes them scattered (Deflate, Gzip, Zlib) and harder to read when looking through documentation for everyone who isn't looking for compression (the 99.999% use case). So no it's not more comfortable. Your conclusion is the complete opposite of mine, the way I look for information and work.

@asterite
Copy link
Member

and harder to read when looking through documentation for everyone who isn't looking for compression

What do you mean? There's a search box

@didactic-drunk
Copy link
Contributor

Looking at the left list of classes and scrolling. Extra classes are jumbled in that 99.999% of the time people don't want to see and would be clearer if they were under a collapsed namespace of "Compression".

It's also not clear what is or isn't compression. Yes this can be handled with search. My point above can't.

@asterite
Copy link
Member

@didactic-drunk This is a matter of how the classes are grouped in the docs, not how they are namespaced in code

@didactic-drunk
Copy link
Contributor

didactic-drunk commented Apr 15, 2020

@asterite Yes, that's what I'm talking about. Below the search box is a list of classes. Using Compression reduces the clutter, groups them by function and makes for easy reading. Similar to Digest.

@straight-shoota
Copy link
Member

If that is the case, only 6 positive votes and 3 downs are not enough sample of the entire community or contributors.

We don't need an elaborate voting process. We just need to make a decision. In the end this topic is by far way to irrelevent to keep discussing forever. The vote was intended to come to a conclusion by simple majority vote. That would be enough to justify going either direction, just by what majority votes.

This simple issue shouldn't be such a hassle.

Even more, the 3 votes downs are core contributors.

Of the 6 upvotes there are also 3 from core contributers. So if we want to make core contributer's votes count more, the downvotes would be offset by them.

not to mention the unbalanced situation that brings for other components (formats or libraries)

I see the current situation is unbalanced and this is actually a step to reduce that. See #8886 (comment) and #9077 as an extreme example.

@RX14
Copy link
Contributor

RX14 commented Apr 15, 2020

I do consider this issue "fairly irrelevant" which is why I proposed a simple vote to resolve this. It's a shame that it didn't despite there being a clear winner, however you weight the votes.

I think it's unlikely we'll reach consensus by discussing further, let's let the vote stand.

@didactic-drunk
Copy link
Contributor

Should we also move Array, Hash, Set, Deque to Containers::?
Should we move File and Dir to Filesystem::?

Frequently used classes can stay in global. Infrequently used classes can move to a namespace. That's all that I've proposed (I won't speak for anyone else today).

Uses of classes in Crystal by the numbers:

Class Count Without System and Compiler
File 180 57
Dir 37 8

Compression classes

Class Count Without System and Compiler
Gzip 2 2
Zip 0 0

How many shards use File vs Gzip? 100 to 1? >1000 to 1? I don't have those numbers but I would guess no shard authors will care if it moves to a namespace where they type Compress:: 2-3 times.

Disclosures:
The data above is shown with and without the compiler included to avoid bias and let you form your own opinions. [file, dir, io, path] were skipped to avoid rigging the numbers with self references. zip was skipped for false positives with Zip::File. Actual File use is slightly higher.

Commands used:
egrep -r 'File\.' * | egrep -v '^(system|crystal|compiler|file|dir|io|zip)/|^(file_utils|dir|file|io|path)\.cr' | wc
egrep -r Zip:: * | egrep -v '^(zip)\/' | wc

@didactic-drunk
Copy link
Contributor

We'll be avoiding polluting the global namespaces with names that will not be taken anyway. So what's the problem

I keep pointing out the problem but either my description of cluttering the class list of the left side of the documentation is not comprehensible or ?. My point has not been addressed by core.

@bcardiff
Copy link
Member Author

Since there is no consensus the defacto is to not change things that are working. That's why I closed the PR.

I think the discussion is not only of this PR change but the general opinion whether things should be grouped in namespaces. That is why we keep discussing things.

Digest was one example and we agree to move them; but as we check other cases there is no consensus if that should apply in every case. Maybe Digest grouping was wrong in that sense; but I prefer to not revisit that.

To bring another example to the table: it was also considered if TCPServer, TCPSocket, UDPSocket, UNIXServer, UNIXSocket could be removed from the top-level inside the Socket.
But there was no consensus there either. But we are not having that conversation because there is no PR; because it was clear ahead of time that not everybody wanted that change.


Sending these cleanup PRs is not the same as vouching strongly for them. I am trying to apply what I understood it was wanted during the review of the top-level. In some cases that went smoothly. In others, with the PR, thoughts can be revisited and the individual decision changed and that is fine.

@RX14
Copy link
Contributor

RX14 commented Apr 16, 2020

The problem is that

There's double the votes to move than there is to keep it as-is. This does not sound like "no consensus" to me

Do we need to have stronger rules for how votes are held? Otherwise, how can we maintain confidence in voting?

If there was actually no consensus, I wouldn't be as persistent on this issue.

@asterite
Copy link
Member

Do we need to have stronger rules for how votes are held?

I just don't think you design a language and a library by voting. Voting can be useful to gauge interest, but I believe there must be someone, or a group of people, that know the project from the beginning and have a global vision for the language.

That said, I'm now ambivalent about this. But since this is just a minor thing I don't think we should be putting more effort here.

@waj
Copy link
Member

waj commented Apr 16, 2020

@RX14 this is not a contest, so design decisions are not going to be based on votes. I strongly consider facts and opinions though, but it's not the count of thumbs up/down what leads the result. I even voted down and I'm not taking that into account to think now and I'm still open to merge this if we decide to.

Some time ago we decided with the core team to proceed with a series of (breaking) changes to unclutter the top-level namespace. This was mostly for objects accessible from the top-level that didn't make sense outside the Crystal compiler for example. But Crystal, as Ruby, is not a namespaced language. So there is no real need to group related things together under a strict rule. That lack of strict rules is also a double-edged sword. It's what leads to this kind of discussions where it's more about "feelings" than logical analysis.

I don't buy either the arguments related to organization of the documentation. I'd love to see some kind of categories that the doc generator uses to make it easier to navigate. To be honest, if we add such system, I would be more inclined to move some classes out to the top level, rather than the other way around.

There are cases where the opposite is true. It's clear sometimes that a class doesn't provide real value by itself. So it's common to put those under another module/class to hide it a little bit from normal use. But that is of course neither a rule to not put other types on the top level.

Speaking specifically about these classes and giving it a second thought, I think they could go under a Compress module, since they are all intimate related, all are more or less implementations of the same thing. Again, I'd not extrapolate a general rule from that to apply to the rest of the standard library.

About the burden of having to write more each time, in the future we could consider having an "private include" feature to alleviate those cases.

@straight-shoota
Copy link
Member

this is not a contest, so design decisions are not going to be based on votes. I strongly consider facts and opinions though, but it's not the count of thumbs up/down what leads the result.

This is abolutely right. We should make informed decisions, not blindly choose a majority vote. But in this case there are good arguments for either way, none strictly stronger than the other and everyone showed support for accepting either outcome. So the understanding is we can't pick wrong, but we need to pick one. In such a situation it is perfectly fine to draw a vote to come to a conclusion. It doesn't overturn an objective design decision because we could not identify a qualitative difference between them. Because of that the discussion could have been endless without a decisive outcome. The idea was to cut a corner and move on. It seems that has failed.

@waj
Copy link
Member

waj commented Apr 16, 2020

The idea was to cut a corner and move on. It seems that has failed.

No, it didn't. This seems like a small thing but are decisions that are hard to push later.
Votes are still considered, but I just wanted to point out that we shouldn't encourage to make decisions based on votes. Or that the fact that there was a voting doesn't imply the decision was already taken.

@RX14
Copy link
Contributor

RX14 commented Apr 16, 2020

I'll echo what @straight-shoota said:

I completely agree that design by committee and voting are not the right tools to design a lot of a language, but they are useful tools to have for issue like this which "don't matter" either way and you'd just like to resolve.

@RX14
Copy link
Contributor

RX14 commented Apr 16, 2020

I don't see core team members overusing voting, it's always been a rare tool for us. But when we have voted in the past, it has been honoured. That breaking of expectations has frustrated me.

@didactic-drunk
Copy link
Contributor

Since there is no consensus

Votes are now 7 to 3.

Some of us aren't happy with that excuse. Manastech said no and everyone else who voted said yes. If you said "Manastech doesn't want it" or "We're not honoring votes" I'd see those as an honest answers.

Or that the fact that there was a voting doesn't imply the decision was already taken.

We never had a say. That's useful information. Now I know there are only 3? people I need to get approval from. Please don't take this as a snarky comment. I've been trying to get some changes through for > 6 months.

@asterite
Copy link
Member

@didactic-drunk It just happens that those contributions are breaking changes. Deciding to do a breaking change when it doesn't come from us is a hard decision to make. We happily merge PRs that are fixes and improvements that are not breaking changes.

@didactic-drunk
Copy link
Contributor

I don't buy either the arguments related to organization of the documentation.

I don't buy the arguments related to inconvenience in typing based on numbers I gathered and published that anyone is welcome to refute. I'd gather results from shards to prove the point but I don't think Manastech would consider it.

#9077 was an illustration of what lack of namespaces looked like. To me namespacing of Digest and Compress are approximatey equal in usage so they should have the same form.

You're not considering votes or evidence. Or saying what criteria is necessary to get approval. That leaves me guessing as to what will or won't be approved.

Again, I don't see how Digest and Compress are different. Both have well known names of classes that are unlikely to clash. No one is suggesting namespacing often used classes. What is the actual criteria of when to use a namespace or does that need to be discussed first?

@asterite
Copy link
Member

Just to calm down and settle the discussion: we had a small conversation with @bcardiff and @waj about two hours ago and we decided that it probably makes sense to introduce the Compress module (but just that module), specially considering that we can do private alias Zip = Compress::Zip or we'll be able to do private include Compress in the future.

That's why there was a force-push recently, and we are moving forward with this.

@girng
Copy link
Contributor

girng commented Apr 16, 2020

While there is good intentions, ultimately, hiding class names behind namespaces will hinder the time it takes a developer to find a class/module. Obviously it's not going to be completely hidden, since they can search the docs, but now they have to write extra and use ::. IMO, this goes against Crystal's simplicity and syntactic sugar.

@bcardiff bcardiff added this to the 0.35.0 milestone Apr 16, 2020
@bcardiff bcardiff merged commit 989b7fb into crystal-lang:master Apr 20, 2020
toddsundsted added a commit to toddsundsted/kemal that referenced this pull request Apr 23, 2020
This change explicitly requires dependent code so that transitional
types aliases (and deprecation warnings) are included. Paves the way
for the release of 0.35.0.

See: crystal-lang/crystal#8886
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.