Meanings of "downstream" and "upstream" are reversed in the docs, as well as semantics of --propagate and --cascade #999

Closed
pkolaczk opened this issue Aug 1, 2019 · 11 comments


@pkolaczk
Contributor

commented Aug 1, 2019

The compilation of a project always requires the compilation of all downstream projects (its dependencies).

I don't think it's right. Dependencies of something are generally called upstream.

Compile upstream projects
The --cascade flag allows you to change the public signature of a project (say, foo) and have bloop compile all the transitive projects depending on foo to detect compilation errors.

Projects depending on foo are generally downstream from foo.

As a result, options --propagate and --cascade are reversed. :(
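
The two sets of projects that the quoted passages talk about can be pinned down without either disputed word. The following is a minimal Python sketch; the three-project build (`app` depends on `lib`, which depends on `core`) and the helper names are hypothetical illustrations, not bloop's API or implementation.

```python
from collections import defaultdict

# Hypothetical build: each project maps to the projects it depends on directly.
DEPENDS_ON = {
    "app": ["lib"],
    "lib": ["core"],
    "core": [],
}


def transitive_dependencies(project, depends_on):
    """Everything `project` needs compiled before it can compile."""
    seen = set()
    stack = list(depends_on[project])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on[current])
    return seen


def transitive_dependents(project, depends_on):
    """Everything that may break when the public signature of `project` changes."""
    # Invert the edges: project -> projects that depend on it directly.
    dependents_of = defaultdict(list)
    for name, deps in depends_on.items():
        for dep in deps:
            dependents_of[dep].append(name)
    seen = set()
    stack = list(dependents_of[project])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dependents_of[current])
    return seen


print(sorted(transitive_dependencies("lib", DEPENDS_ON)))  # ['core']
print(sorted(transitive_dependents("lib", DEPENDS_ON)))    # ['app']
```

The first quoted sentence is about the set returned by `transitive_dependencies`; the quoted `--cascade` description is about the set returned by `transitive_dependents`. The naming question is only which of the two gets called "upstream".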

The words "upstream" and "downstream" are rooted in production processes and have well-established meanings. This blog post explains them quite clearly: https://reflectoring.io/upstream-downstream/


I'm not sure what we can do about this, though. It is quite confusing as it is now, but changing it would break backwards compatibility and come as a surprise. Should we deprecate --cascade and --propagate and find different words? Or just fix it in the next major release and document the change?

@pkolaczk pkolaczk changed the title from 'Meanings of "downstream" and "upstream" are reversed in the docs' to 'Meanings of "downstream" and "upstream" are reversed in the docs, as well as semantics of --propagate and --cascade' Aug 1, 2019

@jvican

Member

commented Aug 2, 2019

First of all, saying "wrong" twice with that arrogance is not an appropriate way of pointing out what you believe is a mistake in the docs. In this project, the way you communicate things is more important than what you say. I take it that you have good intentions and that your choice of words and abrupt sentences might be a byproduct of the social norms in your native language, where such strong sentences are more common, hence to you it reads "normal". It doesn't to me, so I ask you please to be more friendly, humble and polite when interacting with either me or other contributors in this project. I'm sure we can find a way to make this communication work for both of us.

Onto the main concern raised in this ticket: the interpretations of downstream and upstream in the docs and the projects look correct to me and depend on the direction of the arrows of your build graph. If you imagine a build graph as a tree, for a build where C -> B -> A, using --propagate when compiling B means also compiling the downstream project A, and using --cascade when compiling B means also compiling its upstream project C.

That being said, you can't possibly link me to a blog post by a random guy on the Internet interpreting downstream/upstream as s/he seems to like and say that the definitions of both are well-established. They aren't.

The interpretation in this project is the natural interpretation based on the dictionary definition of upstream in Merriam-Webster: "in the direction opposite to the flow of a stream" and likewise for downstream. This very same terminology is used in source control, see https://stackoverflow.com/questions/2739376/definition-of-downstream-and-upstream as well as download/upload. So I do consider these words to be the natural choice for describing the actions of both --propagate and --cascade.

@jvican jvican closed this Aug 2, 2019

@pkolaczk

Contributor Author

commented Aug 2, 2019

I apologize if I used incorrect words. I didn't mean to come across as offensive and didn't want to hurt anyone's feelings. However, I wanted to be precise in pointing out what I felt was incorrect and confusing for me.

As for the "blog post of a random guy", I pointed you to a blog post that I feel describes the issue very accurately and with which I totally agree, in order to avoid repeating the same text here. There are also nice pictures there. This blog post also agrees with all the other sources I could find, including the one you link.

The SO question you point to actually confirms my understanding of the words "upstream" and "downstream". See the highest voted answer:

In terms of source control, you're "downstream" when you copy (clone, checkout, etc) from a repository. Information flowed "downstream" to you.

So downstream is down the information flow, upstream is up the information flow. The information flow direction is the thing that's essential here. By going downstream you add more information (more value) to something, but you depend on the upstream stuff, from which you pull (clone). The dependency arrows point in the opposite direction to the information flow, so I admit it can easily be confusing.

Now let's go back to the multi-module example and compilation, and analyze how information flows there.

When you have projects where A depends on B, which depends on C, you need to first build C, then build B, then finally build A. By the manufacturing analogy, C is the raw materials, B is the mid-product, A is the final product. The material and information stream flows in the direction C -> B -> A. Hence C is upstream, A is downstream. The dependency chain is the opposite, A -> B -> C, but downstream/upstream refers to information flow, not dependency arrows.
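
The build order described above can be checked with a short Python sketch that uses the standard library's `graphlib` (Python 3.9+). The project names come from the example; this only illustrates the ordering argument and is not how bloop schedules compilation.

```python
from graphlib import TopologicalSorter

# Dependency arrows from the example: A depends on B, B depends on C.
depends_on = {"A": {"B"}, "B": {"C"}, "C": set()}

# TopologicalSorter treats each value as the set of nodes that must come
# first, so the resulting order follows the flow of build outputs.
build_order = list(TopologicalSorter(depends_on).static_order())

print(build_order)  # ['C', 'B', 'A']
```

The order runs against the dependency arrows: the project with no dependencies is built first and the project nothing depends on is built last.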

When propagating a change, the change propagates down the information flow, not against it (upstream). Per Merriam-Webster, to propagate is "to cause to spread out and affect a greater number or greater area: extend". So if A depends on C, a change in C will affect / extend to (propagate to) A, but modifying A does not affect / extend to / propagate to the upstream C. I can easily build module C without building A and B, because C doesn't need any information from A or B. Hence it is upstream, and changes from it propagate down to B and A.

Onto the main concern raised in this ticket: the interpretations of downstream and upstream in the docs and the projects look correct to me and depend on the direction of the arrows of your build graph.

Now, which arrows of the build graph do you mean? The dependency arrows? I think the disagreement comes from the fact that you refer to the dependency graph and I'm referring to the information flow graph. The information flow graph is the reverse of the dependency graph.
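
As a rough illustration of why the choice of arrows matters, the Python sketch below builds both graphs for the A/B/C example and follows the arrows out of B in each. The function names are made up for this example.

```python
def reverse_edges(graph):
    """Return a graph with every arrow flipped."""
    reversed_graph = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reversed_graph[target].append(source)
    return reversed_graph


def reachable(start, graph):
    """All nodes that can be reached by following arrows out of `start`."""
    seen = set()
    stack = list(graph[start])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


# Dependency arrows: A depends on B, B depends on C.
dependency_graph = {"A": ["B"], "B": ["C"], "C": []}
# Information flow arrows: the same edges, flipped.
information_flow_graph = reverse_edges(dependency_graph)

print(reachable("B", dependency_graph))        # {'C'}
print(reachable("B", information_flow_graph))  # {'A'}
```

"Following the arrows from B" lands on C in one drawing and on A in the other, so a reader who reads "downstream" as "in the direction of the arrows" gets opposite answers depending on which graph they picture.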

BTW: I don't like the source control analogy for upstream/downstream, because dvcs allow information to flow both ways, and there is also no clear dependency graph. I can pull from the source, but I can also push things upstream back to the source. The "source" and "dependency" split is very artificial. This is different than compiling stuff, where information typically flows in one direction.

@tgodzik

Contributor

commented Aug 2, 2019

This discussion seems purely academic and I don't think it serves any purpose. Changing the parameters right now is really just more unnecessary work when there are still things to do and fix. The intellectual effort of understanding how it works in Bloop is rather negligible.

Let's maybe focus on real issues, rather than trying to prove each other wrong. We lose a lot of time and effort this way that might be directed towards something more constructive.

@pkolaczk

Contributor Author

commented Aug 2, 2019

The discussion is not academic. This is a usability matter. Using words with meanings different from those the rest of the world uses causes confusion for users.

The intellectual effort of understanding how it works in Bloop is rather negligible.

Yes, I agree with that. There is this initial "ok, so I need to use --cascade if I want to propagate changes downstream". Then it all works.

But on the other hand changing this is not a big deal. This is just docs + naming.
Better do it earlier than later, so fewer people have to unlearn.

@tgodzik

Contributor

commented Aug 2, 2019

I must say cascade does make sense to me when it comes to compiling everything that depends on the changed project. But since there is so much confusion when it comes to downstream and upstream, maybe we can drop those words from the docs? Make them more descriptive with something everybody can agree on?

@pkolaczk

Contributor Author

commented Aug 2, 2019

Yes, I also find cascade ok. I had a bigger issue just with the docs usage of downstream and upstream and --propagate. Seriously, I tried to use --propagate on a compile task thinking it would compile all children depending on the project and I was surprised that it means the opposite thing and goes upstream.

+1 to rephrasing the docs without using upstream/downstream.

@pkolaczk

Contributor Author

commented Aug 2, 2019

Ok, I talked to one of our architects and indeed it looks like the words "upstream" and "downstream" are sometimes understood differently by different developers. Consider them unsafe / confusing.

@jvican

Member

commented Aug 2, 2019

Nope, I'm not removing these words from the docs. Let's move on.

@pkolaczk

Contributor Author

commented Aug 4, 2019

Because?

@marek1840

Contributor

commented Aug 5, 2019

Because the terms downstream and upstream are not confusing but relative, and to use such terms correctly one has to know the answer to the question "In relation to what am I thinking?" When it comes to dependency graphs, those can be "A is a dependency of B" and its inversion "B depends on A".

Thinking one point of view is better than the other is not only arrogant but also ignorant of the fact that both of them can highlight different interesting/important aspects of the matter.

Also the fact that until now there wasn't a great number of complaints that the docs "are confusing" suggests that it is not a prevalent issue.

Finally, let's consider the costs vs. the benefits of making this change.
What would we gain?
The docs would be more clear to some (previous point suggests: small) group of people.

What would we lose?
Other users could get surprised by the change in terminology. This is almost never a good thing.
Also: time.

So, it is really a non-issue until many more users start to complain about the docs being unclear. Enough time was invested in this topic on both sides, while it could be solved as easily as just accepting that other points of view can also be valid.

So, thanks for bringing this issue to our attention. I hope that as a result of this discussion the direction of the flow in the dependency graph used in the docs got clearer. But before changing the established terminology, we will need much more feedback from our users.

@pkolaczk

Contributor Author

commented Aug 5, 2019

Because the terms downstream and upstream are not confusing but relative, and to use such terms correctly one has to know the answer to the question "In relation to what am I thinking?"

Agreed. In the bloop context, they are always used in relation to the project that is currently being compiled. The project being compiled is the reference point. Therefore, if I'm compiling project X, then the dependencies of X are upstream and its dependants are downstream. The documentation says "downstream dependencies" and this is a misnomer - there is no such thing as a downstream dependency, and I'm not the only one saying that - see the comments under the article I linked.

Also the fact that until now there wasn't a great number of complaints that the docs "are confusing" suggests that it is not a prevalent issue.

Again I agree, but some complaint has to be the first one. :)
A low number of complaints doesn't mean the docs are correct or not confusing. It only means this is an issue of low importance (did I say it was a critical or a major issue? I don't recall that). Early adopters typically don't care much about the docs as long as the software works for them and they can figure everything out. I also didn't read the docs initially, just glanced through them quickly, saw the --cascade / --propagate options being mentioned, then jumped straight into testing how they work. I found it a bit weird that --propagate works upstream and is not available on a compile command, but I just moved on.

BTW: I really like what these options do! I miss them in gradle already.

Other users could get surprised by the change in terminology.

Yes, that's always a minor risk. But people who already use the project probably don't read the docs. The docs are for people who are learning how to use bloop or what it can do. But I get where you're coming from, and that's why I suggested that we don't change the semantics of how bloop works but only clarify the docs. That should be pretty easy and wouldn't take much time.

The docs would be more clear to some (previous point suggests: small) group of people.

You don't know how large the group of people that got confused is. I showed the docs fragments to my teammates and most were confused by the wording. Most people generally don't report problems in the docs when those problems are not blocking them, particularly people who are only evaluating the product. Many people don't even report problems when something breaks. If too much stuff is broken, they just apply a label "not ready for production time" and move on. Even such an extremely minor issue as the lack of a deb / apt package was taken as a problem in the debates about bloop that I had with some other guys in the company. Also, if the product solves a painful enough problem (and bloop does!), people would use it fine even without docs or with completely broken docs.

Please note that this project is at the moment at a very early stage, and fixing terminology in later stages when it gets more popular will be much harder and even more people would be confused by the change.

Thinking one point of view is better than the other is not only arrogant but also ignorant of the fact that both of them can highlight different interesting/important aspects of the matter.

This is not a debate about points of view, and not a debate about whose point of view is better. These are technical terms. Technical terms typically have, or at least should have, clear, unambiguous meanings. If somebody reports a problem with the docs, please do some research first or just say you don't have time now, but then leave the issue open. Closing issues with no good reason, cutting off the debate and calling somebody who wanted to help arrogant is arrogant.

Pointing out mistakes and submitting fixes is the foundation of open-source collaboration. I have already spotted and fixed many problems with bloop, I managed to make it work with a non-trivial 1+ MLOC Java/Scala project, and I have always spoken very highly of bloop and sbt/zinc. I'm defending it from people who said the version number should be 0.2 and not 1.x, because they tried it before me, failed miserably and just gave up. As you can see, they didn't report these failures here.

@scalacenter scalacenter locked as resolved and limited conversation to collaborators Aug 5, 2019

@jvican jvican added the wontfix label Aug 5, 2019

4 participants