Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: remove fmt dependency? #15583

Closed
cathalgarvey opened this issue May 6, 2016 · 25 comments

Comments

Projects
None yet
10 participants
@cathalgarvey
Copy link

commented May 6, 2016

Would there be interest in a PR that removes fmt as a dependency in encoding/json, if doing so didn't overly complicate the code?

I ask because fmt is a large dependency to pull in for what appears to be exclusively error formatting in decode.go.

Happy to contribute this.

@bradfitz bradfitz changed the title Remove `fmt` from `encoding/json`? encoding/json: remove fmt dependency? May 6, 2016

@bradfitz bradfitz added this to the Go1.8Maybe milestone May 6, 2016

@bradfitz

This comment has been minimized.

Copy link
Member

commented May 6, 2016

Maybe? Got any before/after numbers?

/cc @crawshaw

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 6, 2016

None yet; you mean for binary size or performance-in-error?

Contributing to Go is something of a time investment so I didn't bother doing any work before checking if there was interest.

@bradfitz

This comment has been minimized.

Copy link
Member

commented May 6, 2016

For whatever metric you cared enough about to file this bug.

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 6, 2016

Ha, OK: background then. I'm using GopherJS to transpile some projects to JS for a dashboard. While this is the sort of case where large JS files are OK, because it's a long-running browser App and load-times are OK, I noticed that the binaries were huge.

Two obvious culprits were net/http, which was pulling in transpilations of the entire TLS stack, and fmt, which has a reputation for being large.

So, while Go binaries being large is generally simply taken as a fact of life in go build land, over in gopherjs build land it's another story, because 3.5M JS libraries are essentially useless unless the comprise the whole core product.

..hence my recent crusade to try and reduce the use of fmt.Errorf when errors would suffice. I don't yet know that to be the case in encoding/json, because at least here the f part of Errorf appears to be leveraged, as is the de-interfacing powers of fmt.

@bradfitz

This comment has been minimized.

Copy link
Member

commented May 6, 2016

encoding/json already depends on strconv which has https://golang.org/pkg/strconv/#Quote and is about the only interesting use of fmt in the json package.

So try it and see how much smaller the Javascript gets.

@minux

This comment has been minimized.

Copy link
Member

commented May 6, 2016

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 6, 2016

Is there not a mild intrinsic value in partitioning, anyway? And value also in using the appropriate tool, perhaps in this case errors for minor errors?

As I said, formatting of interface{}s may be important here when I dig in. In which case, fmt may be the best fit for purpose. If not, why keep it?

On 7 May 2016 00:23:24 IST, Minux Ma notifications@github.com wrote:

How many applications will benefit from this?

I imagine it'd be find an application that only uses encoding/json but
not
fmt or any other packages that use fmt.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#15583 (comment)

Sent from my Android device with K-9 Mail. Please excuse my brevity.

@minux

This comment has been minimized.

Copy link
Member

commented May 7, 2016

@josharian

This comment has been minimized.

Copy link
Contributor

commented May 8, 2016

The other benefit to fewer dependencies is that the dependency graph is sparser, so more things can be compiled concurrently. This applies whether or not the code uses fmt somewhere else. The go command is not as good at taking advantage of this now as it could be, but I hope that that will improve next cycle.

I think the patch could be made nicer by introducing local functions errorf and errorf2 (naming improvements welcome), at which point the patch is basically s/fmt.Errorf/errorf/ or s/fmt.Errorf/errorf2 as appropriate.

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 8, 2016

@josharian: Interesting, I hadn't even considered that; still a lot of compiler-fu to learn. :)

@minux: I don't think that harms readability much, personally? Also, 9% is pretty significant in my book, and may work out even better for downstream hax like GopherJS. For clarity, a variadic local errorf as @josharian suggests would clean things up significantly.

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 8, 2016

@minux: I made some changes to some GopherJS code to avoid encoding/json, using my fmt-free fork of github.com/kurrik/json instead. It shaved about .5Mb from the resulting unminified JS (1.5Mb->961k), which was a bit over 33%. So, while the premium of 9% in pure-Go might seem mild, for GopherJS it's a huge difference.

Addendum; these successful reductions also came from removing fmt from another upstream tool; took me a while to hunt it down in that dependency and I ended up pulling it into my tree. After removing net/http from my GopherJS code, which removed ~1.5Mb, the biggest reduction has been the 0.5Mb reduction from removing fmt. It matters, this can be the difference between using Go and simply using straight JS!)

@minux

This comment has been minimized.

Copy link
Member

commented May 8, 2016

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 8, 2016

Plenty of my packages use JSON but not fmt? Not all applications push out human-readable text output yunno. ;)

On 8 May 2016 20:55:52 IST, Minux Ma notifications@github.com wrote:

The huge reduction probably suggests there is something
very inefficient about the fmt package. I'd suggest figure
out why GopherJS compiler generates so much Javascript
for the fmt package first.

The 9% reduction only applies to toy examples. Again, I
can't think of any real world Go application that only uses
encoding/json, but not (directly or indirectly) fmt.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#15583 (comment)

Sent from my Android device with K-9 Mail. Please excuse my brevity.

@minux

This comment has been minimized.

Copy link
Member

commented May 8, 2016

fmt is not only about user output.

Of the 155 std packages, 83 of them uses fmt.
Pretty much all io related packages (net, net/http,
os/user, log) use fmt, so I will be very surprised
to learn some useful Go application that doesn't
use fmt at all.

Basically, the only way such a program can interact
with the outside world without fmt package is via
the syscall and os packages. And such a program
can't even use the flag package....

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 9, 2016

It's no surprise that text-heavy tools use fmt, but a lot of packages use it only for Errorf. If the argument is that fmt is unavoidable, and we should learn to love to bomb.. I don't buy it.

If they're all like encoding/json and a little effort could strip it, then why not incrementally decouple the stdlib and remove it?

I've just stripped encoding/xml, and it was harder. Formatting was used more often for quoting and interface conversions. But even then, the types and formats were all the same; string, int, and interface.Type. The full power of fmt was not warranted, merely a helper that would convert and concatenate variadic simply types.

On 8 May 2016 23:33:31 IST, Minux Ma notifications@github.com wrote:

fmt is not only about user output.

Of the 155 std packages, 83 of them uses fmt.
Pretty much all io related packages (net, net/http,
os/user, log) use fmt, so I will be very surprised
to learn some useful Go application that doesn't
use fmt at all.

Basically, the only way such a program can interact
with the outside world without fmt package is via
the syscall and os packages.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#15583 (comment)

Sent from my Android device with K-9 Mail. Please excuse my brevity.

@spenczar

This comment has been minimized.

Copy link
Contributor

commented May 9, 2016

@minux - I think you're right that it would be rare for fmt to be missing from a meaningful program that uses json... but that's not a reason to keep the dependency. Is there any reason that this mild bit of separation is harmful? While you're correct that the benefits are small, it should be done anyway if there is no downside and the change is simple.

@bradfitz

This comment has been minimized.

Copy link
Member

commented May 9, 2016

I agree with @minux that almost all real programs will bring in fmt anyway and you should first understand why fmt compiles to such large GopherJS output code first.

The other possibility is modify the GopherJS compiler to replace fmt.Errorf with a lighter work-alike. But that's kinda fixing the wrong problem.

I don't want to start making every package depending on fmt have its own fmt-like "minimal" versions. That would mean that in a real program we'd have both the actual fmt implementation as well as all the "minimal" fmt clones, making the size problem even worse.

I'd rather just see the linkers (Go and GopherJS's) fixed to do more dead code elimination, if that's the issue. (sometimes unnecessary generated init funcs pin things that might not otherwise be needed).

But if it turns out that a perfect linker couldn't dead-code eliminate something (perhaps because the code paths to take depend on the Errorf format string), perhaps only then could we change something to give the linker more information. But only if it helps significantly.

@cathalgarvey

This comment has been minimized.

Copy link
Author

commented May 9, 2016

Personally I feel that, architecturally, fmt would probably have more logically been broken into the scan and formatting functions. When I import fmt, am I not also importing the entire Scan tooling? At least in GopherJS this appears to be the case?

As it happens, making "lite" versions for my GopherJS projects is exactly what I'm doing right now. I'm making a fmt clone for just the most common 99% of fmt usecases in the wild: fmt.Errorf for simple errors, and fmt.Println/fmt.Sprintf for simple datatypes and format strings. Most of the extra 'goodies' in fmt are unnecessary for these cases.

Perhaps the most valuable change would be to make errors just a little bit richer, to entice people away from fmt.Errorf for things that don't really call for it. If errors.New had been a variadic that accepts and default-formats inputs (a la "%v"), joining without spaces, then perhaps people wouldn't be using fmt.Errorf so much. :)

@bradfitz

This comment has been minimized.

Copy link
Member

commented May 9, 2016

When I import fmt, am I not also importing the entire Scan tooling? At least in GopherJS this appears to be the case?

A good linker should discard things which are unused. I don't know whether GopherJS can do that.

@randall77

This comment has been minimized.

Copy link
Contributor

commented May 9, 2016

None of the Scanf code is included in a binary if you use only the print functions in fmt.
(This was fixed in https://go-review.googlesource.com/17398/)
As Brad says, this may be an issue in the GopherJS linker, whatever "linker" means in that context.

@dmitshur

This comment has been minimized.

Copy link
Member

commented May 9, 2016

I don't want to start making every package depending on fmt have its own fmt-like "minimal" versions. That would mean that in a real program we'd have both the actual fmt implementation as well as all the "minimal" fmt clones, making the size problem even worse.

I agree with @bradfitz here. I think that trying to remove a single dependency from a package in order to optimize performance is setting a bad precedent. The Go programming language and its compilers are meant to make our jobs, the humans, easier, and I want to continue to be be able to write natural and idiomatic Go code.

I'd rather just see the linkers (Go and GopherJS's) fixed to do more dead code elimination, if that's the issue. (sometimes unnecessary generated init funcs pin things that might not otherwise be needed).

This is where I'd like to see time and effort being spent, rather than trying to eliminate usages of packages in a few places, because that is likely to regress in the future, or cause pain for people trying to write idiomatic Go code because suddenly there are certain packages they shouldn't import (according to fuzzy metrics).

A good linker should discard things which are unused. I don't know whether GopherJS can do that.

But if it turns out that a perfect linker couldn't dead-code eliminate something (perhaps because the code paths to take depend on the Errorf format string), perhaps only then could we change something to give the linker more information. But only if it helps significantly.

GopherJS is able to and currently performs high quality DCE, potentially better than generic JS minifiers can, because it has full access to the original source that's being built and it sees what is statically needed/used. It follows the constraint that it must always produce valid code that works as specified in Go spec; it can't take shortcuts that would be against Go spec.

However, its DCE efforts are significantly thwarted because of run-time reflection via reflect package. That ability means it's not possible to determine at compile type if some types, etc., are not used and eliminate them, because they may be accessed via reflection (which in 99% of cases they are not, but to generate correct programs, that 1% possibility cannot be discarded). /cc @neelance

There is an open issue gopherjs/gopherjs#186 to implement optional "unsafe aggressive DCE" where if you promise not to use reflect to access things that aren't statically mentioned, GopherJS could eliminate them (and if you break your promise, behavior is undefined). However, it's not implemented yet.

@minux

This comment has been minimized.

Copy link
Member

commented May 9, 2016

@dmitshur

This comment has been minimized.

Copy link
Member

commented May 9, 2016

even with reflect, you still can't use a type that has not been created statically somewhere in the program.

I think that's true for types, but I don't think it is for methods. Sorry, I should've been more precise in my post above (instead of saying "types, etc.").

See https://godoc.org/reflect#Value.MethodByName. An arbitrary program can get user input from keyboard or over network as a string and use MethodByName and Call a method that is not statically mentioned anywhere.

Not being able to eliminate methods causes a chain reaction, since those methods will mention other funcs, types and their methods, and usually pull in large parts of Go packages that are not statically used.

See also recent @crawshaw's changes on linker DCE and reflect.

Those are helpful, thanks for pointing them out! For reference, what I found is what's mentioned at bottom of #6853 and https://github.com/golang/go/commits/master?author=crawshaw.

@minux

This comment has been minimized.

Copy link
Member

commented May 10, 2016

@rsc

This comment has been minimized.

Copy link
Contributor

commented Oct 20, 2016

No thanks. I think the expectation is that basically every Go program in the world is going to import fmt one way or the other.

@rsc rsc closed this Oct 20, 2016

@dmitshur dmitshur referenced this issue Feb 23, 2017

Merged

Separate internal and net.Conn implementations #20

7 of 7 tasks complete

@golang golang locked and limited conversation to collaborators Oct 20, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.