Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: image: add generic metadata support for jpeg, gif, png #33457

Open
drswork opened this issue Aug 4, 2019 · 22 comments

Comments

@drswork
Copy link

commented Aug 4, 2019

Proposal

Currently the standard image libraries provide no way to read or write most of the metadata associated with an image. I'd like to change that so that the standard image loading libraries surface the metadata in images they read and allow code to annotate images with metadata.

Each image format has its own metadata. Additionally there are two common image metadata formats (eXif and XMP) which are each used in multiple image formats. (XMP is used in GIF, JPEG, and PNG files, while eXif is used in PNG and JPEG files) The formats themselves also have some interesting quirks which make excessive cleverness unwise. (There are three(!) separate key/value stores attached to PNG files, for example, and the format doesn't disallow a key being used multiple times in the different, or even the same, kv store)

With this in mind, the three guiding principles behind this proposal are:

  1. Metadata should be minimally processed on read and write
  2. It should be possible to read an image and immediately write it out in the same format, with the output having semantically identical metadata to the original.
  3. It should be drop-in compatible with the existing image interfaces.

Point 1 means that PNG's three key/value stores will be exposed as three separate stores, and can't be a plain string/string map. It also means that code which wishes to pull information out of a PNG needs to know to look in both the XMP and eXif data.

Point 2 means we need to at least record any metadata that we don't currently know how to interpret so it can be written back out, and the images that are returned from decoding an image need to have its metadata silently attached.

Point 3 means we can't add methods to existing interfaces, since it's possible there are other packages that implement them. It also means we can't alter the calling conventions of RegisterFormat() in image.

This proposal should make it possible for third-party image libraries to participate in the new standard if they choose.

This proposal has some decisions which are either potentially controversial or potentially fragile. The ones I can think of are noted in the Potential Issues section

NOTE: "Metadata" means any data not directly related to the pixels in an image on disk. For example PPI and rotation, for example, are metadata, while alpha channel is not. (Rotation is arguably pixel data the way that alpha channel is, but since it massively affects the translation of disk pixels to image pixels we'll classify it as metadata for now)

Changes to current packages

Some of the standard packages will be changed. The changes are backwards-compatible.

image

  • New registration function RegisterFormatExtended
  type Format struct {
    Name           string
    MagicString    string
    Decode         func(io.Reader) (Image, error)
    DecodeConfig   func(io.Reader) (Image, error)
    DecodeMetadata func(io.Reader) (*metadata.Metadata, error)
    GetMetadata    func(Image) (applies bool, *metadata.Metadata, error)
  }

  RegisterFormatExtended(*Format)

The new registration format includes a function to decode the metatada, as well as a format-specific function to return the parsed metadata for an image.

  • RegisterFormat marked deprecated

It will still work, but internally will just construct a Format struct and call RegisterFormatExtended.

  • GetMetadata added as registered function

This function will be passed an Image. Because there's no way to tell what type the image is, the metadata getting code must iterate through each format's get function. The function sets applies to false if it doesn't apply. Iteration short-circuits at the first function that says it does apply.

image/gif

  • Metadata type, and supporting types, added
type Metadata struct {
  Comment     []string
  XMPMetadata *xmp.XMP
  AppBlocks   []AppBlock
}  

type AppBlock struct {
  AppID [8]byte
  AppAuthCode [3]byte
  BlockData []byte
}
  • Metadata added to Options
type Options struct {
    NumColors int
    Quantizer draw.Quantizer
    Drawer    draw.Drawer
    Metadata  *Metadata // Added
}

image/png

  • Metadata added. Note that for the initial proposal not all the chunks are proposed to be exported. Any chunk the library can't process is stored in the unexported field, which is only accessible to the library's reader.
type Metadata struct {
  KVText           []struct{string, string}
  CompressedKVText []struct{string, string}
  UnicodeKVText    []struct{string, string}
  ChangeTime       *time.Time
  XMPMetadata      *xmp.XMP
  ExifMetadata     *exif.Exif
  unexported       map[string][]byte // For currently uninterpreted chunks
}
  • Encode function gets optional Options parameter:
func Encode(w io.Writer, m image.Image, o *Options) error

type Options {
  Metadata *Metadata
}

image/jpeg

  • Metadata added
type Metadata struct {
  ExifMetadata *exif.Exif
  XMPMetadata  *xmp.XMP
  unexported []byte // For other uninterpreted data
}
  • Metadata added to Options struct
type Options struct {
    Quality  int
    Metadata *Metadata
}

New packages

We add a new general-purpose metadata package to hold the metadata get function. We also add packages for the two shared metadata formats, XMP and eXif.

images/metadata

  interface Metadata {
    // Make sure the Metadata struct is actually valid, as each format has
    // its own quirks and restrictions.
    Validate() error
    // Here mostly to disambiguate from all the other interfaces with just
    // Validate in them.
    IsImageLibraryMetadata()
  }

  // Returns the filetype-specific metadata. Type-switch to tell what kind.
  Get(image.Image) (*Metadata, error)

Get will internally iterate through the registered GetMetadata functions until it finds one that matches.

The Validate method allows code to make sure the metadata struct is valid for the format before writing it out. This is important since many of the fields have quirks and limitations -- for example the key in png's key/value store must be ISO 8859-1 and between 1 and 79 characters.

images/metadata/exif

The exif package decodes metadata in eXif format.

  type Exif {
    // Handwave on the internals until the proposal is generally deemed OK
  }

  // Decode a binary blob, without any leading or trailing markers, as exif
  func Decode([]byte) (*Exif, error)

images/metadata/xmp

The xmp package decodes metadata in XMP format. (Or, if you prefer, ISO 16684-1:2019 as it's an official ISO standard)

  type XMP {
    // Handwave on the internals until the proposal is generally deemed OK
  }

  // Decode a binary blob, without leading or trailing markers, as xmp
  func Decode([]byte) (*XMP, error)

Potential Issues

The proposal has some decisions which are either potentially controversial or potentially fragile. They are noted

Iterating through metadata get functions is fragile

I can't think of a better way, without having the libraries tag the returned Image with a type, which the API doesn't support that I can see.

Adding an optional *Options parameter to png.Encode

This has to be done by actually adding a ...*Options parameter to the signature and then complaining if multiple ones are passed. This requires runtime signature validation, which isn't great

Adding *Metadata to existing Options parameters may cause unexpected behaviour

jpeg.Options has an integer quality parameter, so it can't actually be left off. We'd have to interpret 0 == default. gif.Options has an integer color count parameter, and we'd have to interpret 0 == default.

No access to uninterpreted metadata fields and chunks

This is intentional. It means that writers can't add their own chunks to the output, and it means that readers can't implement their own interpretation code. This is definitely inconvenient, but it means that when the additional chunks are interpreted by the standard library there won't potentially be two ways to access the same data.

Once the implementation is deemed complete then read/write access to these additional chunks shoud be added.

Optional metadata entites are stored as pointers to things.

This allows us to interpret nil pointers as elided entities and not require a HasFoo functions for each element. Not everyone is fond of this style.

No translation between image format metadata

If you read a jpeg in, then write it as a gif, the metadata isn't translated. This is intentional, since there's currently no clear 1:1 translation between file format metadata information.

@gopherbot gopherbot added this to the Proposal milestone Aug 4, 2019

@gopherbot gopherbot added the Proposal label Aug 4, 2019

@odeke-em odeke-em changed the title Proposal: Adding metadata support to images/{jpeg, gif,png} Proposal: add metadata support to images/{jpeg, gif,png} Aug 4, 2019

@odeke-em

This comment has been minimized.

Copy link
Member

commented Aug 4, 2019

Thank you for this proposal @drswork and welcome to the Go project!

We shall take a look at this perhaps during Go1.14, and I'll also kindly ping our image expert @nigeltao.

@odeke-em odeke-em changed the title Proposal: add metadata support to images/{jpeg, gif,png} Proposal: add metadata support to image/{jpeg, gif,png} Aug 4, 2019

@odeke-em odeke-em changed the title Proposal: add metadata support to image/{jpeg, gif,png} proposal: add metadata support to image/{jpeg, gif,png} Aug 4, 2019

@drswork

This comment has been minimized.

Copy link
Author

commented Aug 5, 2019

Sure. Just to be clear, I'm not asking anyone else to implement this -- I'm perfectly happy to do the work. I just want to make sure that the API and semantics are mostly correct before starting anything.

@smasher164

This comment has been minimized.

Copy link
Member

commented Aug 5, 2019

This may also enable a general solution to #11420 and #20613, since color-space information is also metadata.

@drswork

This comment has been minimized.

Copy link
Author

commented Aug 6, 2019

I fully expect to surface the color space information as metadata. Actually applying it to the image is well beyond my image manipulation skills so I'll leave that part to someone else.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Aug 6, 2019

/cc @nigeltao

@rsc rsc changed the title proposal: add metadata support to image/{jpeg, gif,png} proposal: image: add generic metadata support for jpeg, gif, png Aug 6, 2019

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2019

Thanks for the detailed write-up. A few quick observations:

  • ICCP metadata isn't mentioned?

  • I don't understand how Format.GetMetadata is supposed to work. Is it for decoding, encoding, or both? If it's decoding, Decode will return you e.g. an *image.RGBA. How are you going to squeeze some metadata out of an *image.RGBA?

Similarly, you say:

Get will internally iterate through the registered GetMetadata functions until it finds one that matches.

but Get takes an image.Image. How does a png.Format or a jpeg.Format know that it applies to an *image.Gray?

  • Is there no way to decode both the image and metadata in one pass, without having to rewind the io.Reader (which might require arbitrarily large buffering)? Yes, DecodeConfig doesn't let you do this either, but in hindsight, it might have been nice for DecodeConfig to return some sort of way to continue a partial decoding without having to rewind the io.Reader.

  • You also say:

Each image format has its own metadata.

And they also implement metadata.Metadata (right?), but if I have a value of type metadata.Metadata, is there a good way to ask it for its Exif (if it has one)?

  • Do you need a Metadata.Validate method? It's not like you're going to pass a jpeg.Metadata to png.Encode, right? It seems like you could have a (non-exported) validate(*Metadata) function in the png package, which only took a png.Metadata.

  • You also say:

Adding an optional *Options parameter to png.Encode

That'll break any existing code that looks like:

var encode func(io.Writer, image.Image) error = png.Encode


In general, there's a lot of promising ideas here, but I also think that landing on the right design will take longer than e.g. 3 months of iterative feedback, and 3 months is the "feature development" part of the 6-monthly Go release cycle (https://github.com/golang/go/wiki/Go-Release-Cycle). I'm hesitant to add something in e.g. the 1.14 release cycle that, 9 months later, we realize could work much better with a different API design, but by then we're then stuck by backwards compatibility constraints.

For example, if an encoded image contains megabytes of some flavor of metadata, but all we want is an EXIF orientation, will we still have to decode all of the metadata into an in-memory representation up front? It's not exactly the same, but there might be some food for API thought from #5465 "The Request struct... has all its fields publicly exposed, most of which require memory allocation".

I'd advise forking the stdlib's image/* packages for now, and adding these new features there. Once we get some good iterations and usage on that one, then we can consider adding stdlib API that we're going to have to support for many years.

@drswork

This comment has been minimized.

Copy link
Author

commented Aug 14, 2019

ICCP metadata isn't mentioned?

Yeah, at the time I wrote the proposal I wasn't aware of any of the details of ICCP. Now that I am (I gave the spec a scan over the weekend after writing a bunch of png chunk decoders) I admit I'm inclined to punt and surface it as a binary blob, leaving it to someone else to decode.

While on the one hand it's not great since it isn't decoded, on the other at least the data's surfaced so it can be decoded, so it's a step up from the current state of things.

I don't understand how Format.GetMetadata is supposed to work. Is it for decoding, encoding, or both? If it's decoding, Decode will return you e.g. an *image.RGBA. How are you going to squeeze some metadata out of an *image.RGBA?

and

but Get takes an image.Image. How does a png.Format or a jpeg.Format know that it applies to an *image.Gray?

This bit of API awkwardness is because image.Image is an interface, so I can't add anything to it without breaking things. (I would've preferred to just hang a Metadata field off image.Image, but alas that's not an option)

My assumption here is that image/{jpeg,gif,png} would hang a real metadata field off whatever concrete type they returned, and when you called .GetMetadata it would query the thing it was handed to see if it had 's metadata hanging off it. That was something I figured was an implementation detail and was handwaving past to focus on the user-facing API, but I should've been a bit more concrete here.

Each image format has its own metadata.

And they also implement metadata.Metadata (right?), but if I have a value of type metadata.Metadata, is there a good way to ask it for its Exif (if it has one)?

Because the three image formats have such... interestingly divergent ideas of what metadata to store and how to store it, I hadn't planned on having the different metadata types be particularly interchangeable. In concrete terms it means that if you're handed a metadata.Metadata there's very little you can do with it without a type switch and casting it to a {png,gif,jpeg}.Metadata.

In the few rare cases where the different formats hold the same data then the fields that hold it should be named the same, but that's a matter of convention more than anything. (Also I'm not sure it's a good idea anyway, since the way the data is stored causes issues. XMP metadata is in both GIF and PNG files, but PNG stores it as a text blob with a particular key in its key/value store and a good case could be made that pulling it out and decoding it hides an essential, if very annoying, property of the data)

Do you need a Metadata.Validate method? It's not like you're going to pass a jpeg.Metadata to png.Encode, right? It seems like you could have a (non-exported) validate(*Metadata) function in the png package, which only took a png.Metadata.

Oh, yes, this is definitely required. It's not to validate that you're handing jpeg metadata to the png encoder, but more that the information in the metadata is valid in the first place. The specs have some firm(ish) ideas about data validation -- for example, in png files it's bogus to have both iCCP and sRGB data, the keys in its key/value store have to be latin-1, and they can't be more than 79 characters. It'd be lovely if each type in the metadata structure was strictly specified such that you can't possibly create bogus metadata, but that seems infeasible.

I am also 100% sure every single data validation requirement (well, maybe not the png key length, though I wouldn't be surprised if that's a good attack vector for some decoders) has been hilariously violated by software out in the wild, and as such I'd actually argue that encode should write out bogus metadata when handed it, since that metadata may well have come from an existing image. I don't think it's a good idea to not be able to write out data we just read (opinions differ here, of course), so encode needs to not complain

Is there no way to decode both the image and metadata in one pass, without having to rewind the io.Reader (which might require arbitrarily large buffering)? Yes, DecodeConfig doesn't let you do this either, but in hindsight, it might have been nice for DecodeConfig to return some sort of way to continue a partial decoding without having to rewind the io.Reader.

It's actually much, much easier to decode the metadata and image in a single pass. Honestly DecodeMetadata is in there mostly because DecodeConfig exists. I assumed there were cases where code only wanted the configuration info and not fully decode the image, and I'm 100% good with dropping it. (I can see a few cases where only wanting the metadata would be useful, but I'm good with requiring folks to decode the image for now and consider adding the API in later)

Adding an optional *Options parameter to png.Encode

That'll break any existing code that looks like:

var encode func(io.Writer, image.Image) error = png.Encode

Bah. That was the least bad option I could come up with. I'm open to alternatives -- adding another encode function, the way image/gif has done, was my Plan B, with some kind of metadata attachment function being the backup plan for that.

In general, there's a lot of promising ideas here, but I also think that landing on the right design will take longer than e.g. 3 months of iterative feedback, and 3 months is the "feature development" part of the 6-monthly Go release cycle (https://github.com/golang/go/wiki/Go-Release-Cycle). I'm hesitant to add something in e.g. the 1.14 release cycle that, 9 months later, we realize could work much better with a different API design, but by then we're then stuck by backwards compatibility constraints.

I fully expect the result will be sub-optimal in some way, sure. APIs always are. (Heck, the current API, with each format having different Encode function signatures, is a clear example there) That's part of the reason why I wrote this up for discussion before starting in on coding, to try and work out a reasonable API.

I have the free time to work on this, though I realize it may not be my time constraints that are the issue for the API design.

For example, if an encoded image contains megabytes of some flavor of metadata, but all we want is an EXIF orientation, will we still have to decode all of the metadata into an in-memory representation up front? It's not exactly the same, but there might be some food for API thought from #5465 "The Request struct... has all its fields publicly exposed, most of which require memory allocation".

I'm sympathetic to lazy decoding, sure. In regards to #5465 is the concern the amount of memory allocated or the number of objects created? Having function calls intermediate access to metadata elements isn't without cost either, though I can certainly see deferring, say, XMP decoding until something actually fetches the XMP data.

This does kinda argue for a DecodeMetadata() call of some sort, FWIW. :)

I'd advise forking the stdlib's image/* packages for now, and adding these new features there. Once we get some good iterations and usage on that one, then we can consider adding stdlib API that we're going to have to support for many years.

I'm unclear here -- are you suggesting making a branch and flddling with image/ in that branch, or actually releasing cloned version of the image packages under a different name and see how that goes?

@drswork

This comment has been minimized.

Copy link
Author

commented Aug 14, 2019

Also, semi-separately, is there a better way to iterate on the API than github issue comments? At work I'd spin up a google doc for this but that doesn't seem viable here. (I suppose I still can, though that shuts most folks out)

@networkimprov

This comment has been minimized.

Copy link

commented Aug 14, 2019

Each major API draft could be a new issue. Or each major feature of the API could be a separate issue.

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 15, 2019

My assumption here is that image/{jpeg,gif,png} would hang a real metadata field off whatever concrete type they returned,

So, png.Decode, when passed (PNG-encoded) RGBA data, wouldn't return an *image.RGBA? That might not be a breaking change, from the compiler's point of view, but it's probably a breaking change, in the http://www.hyrumslaw.com sense. For example, this test code assumes that decoding a known PNG-encoded image results in an *image.Gray:

https://github.com/golang/image/blob/cff245a6509b8c4de022d0d5b9037c503c5989d6/webp/decode_test.go#L80
https://github.com/golang/image/blob/cff245a6509b8c4de022d0d5b9037c503c5989d6/webp/decode_test.go#L101

There are undoubtedly similar tests out there in the wild.

Or, for example, you can currently pass the result of a png.Decode call to the image/draw package, whose implementation has optimized code paths for e.g. *image.RGBA sources and destinations. Interposing an intermediate image-with-metadata type will force skipping the optimized path, in favor of a more general but substantially slower path. Sure, you could patch image/draw at the same time that you introduce your metadata code, but it'll still affect all the third party Go code, outside of the stdlib, that does a similar type switch.

In the few rare cases where the different formats hold the same data then the fields that hold it should be named the same,

One possibility is that they all satisfy the same interface{ GetExif() etc }. I'm not saying that it's necessarily a good idea, but it's an idea.

It's not to validate that you're handing jpeg metadata to the png encoder, but more that the information in the metadata is valid in the first place.

I'm not questioning the need to validate, I'm questioning the need for validate to be (1) a upper-case-V (i.e. exported) method, (2) on the interface type (not just on concrete types).

If it's for decoding, not encoding, then png.Decode can call a lower-case-v validate method on its png.Metadata before it returns. Everything's within the package boundary. Or just validate it incrementally, as it's built incrementally.

Another way to put it: who would call the upper-case-V Validate method, on a value of the interface type (metadata.Metadata) instead of a concrete type (png.Metadata)?

adding another encode function, the way image/gif has done, was my Plan B

I think you're going to have to go with Plan B. I guess this reinforces my point that I'm hesitant to commit to API too early. There are some clear mistakes in the stdlib image API, but we can't change them now.

In regards to #5465 is the concern the amount of memory allocated or the number of objects created?

Both? For example, if you look for "sky_with_icc.gif" in the (private) google3 repository, it's a 3MB GIF file for which over 60% of that bulk is an ICC profile. Because the GIF format is... sub-optimal, that 1.8MB profile is cut into over 7000 255-byte chunks. If I want to decode that GIF file in Go, will I be obligated to allocate something (many things?) to hold all of that metadata, even if I don't actually care about it?

are you suggesting making a branch and flddling with image/ in that branch, or actually releasing cloned version of the image packages under a different name

The latter: make a github.com/google/drsimage repository (you can obviously change "drsimage" to whatever you please), and make e.g. import "github.com/google/drsimage/image/png" be a drop-in replacement for import "image/png". Type aliases will undoubtedly be useful here.

is there a better way to iterate on the API than github issue comments?

You could start a new CL to the main go repository (containing the stdlib), with no actual implementation, just API (new types, methods, etc) and, ideally, doc comments. You could add panic's at various points so that it all compiled, if you wished. Send it out for review, noting that it's not for submission, just for API discussion.

Or, if you prefer a google doc, you could make it world-editable.

@drswork

This comment has been minimized.

Copy link
Author

commented Aug 15, 2019

My assumption here is that image/{jpeg,gif,png} would hang a real metadata field off whatever concrete type they returned,
So, png.Decode, when passed (PNG-encoded) RGBA data, wouldn't return an *image.RGBA?

It would, yes. image.RGBA would just get a metadata field attached to it.

In the few rare cases where the different formats hold the same data then the fields that hold it should be named the same,
One possibility is that they all satisfy the same interface{ GetExif() etc }. I'm not saying that it's necessarily a good idea, but it's an idea.

Possibly, though I'd very much prefer to use formal interfaces as little as possible with this since they're impossible to extend in core libraries and I fully expect that there will be things that are missed in the first version of this. I was more thinking just naming the struct fields the same, though having accessors for the expensive-to-decode fields (I'm specifically thinking of the XMP stuff because it's XML-encoded) and deferring decoding is reasonable.

It's not to validate that you're handing jpeg metadata to the png encoder, but more that the information in the metadata is valid in the first place.
I'm not questioning the need to validate, I'm questioning the need for validate to be (1) a upper-case-V (i.e. exported) method, (2) on the interface type (not just on concrete types).
[snip]
Another way to put it: who would call the upper-case-V Validate method, on a value of the interface type (metadata.Metadata) instead of a concrete type (png.Metadata)?

Fair point. My thinking here was that:

  1. The base image types would get a metadata field attached to them
  2. That field should be more strongly typed than interface{}
  3. Interfaces with a single dummy entry for type-enforcement are kinda silly

Since the concrete types would all have Validate methods it made a certain amount of sense to hoist it up into the interface to address point 3. I'm not particularly strongly wedded to that one, though.

adding another encode function, the way image/gif has done, was my Plan B
I think you're going to have to go with Plan B. I guess this reinforces my point that I'm hesitant to commit to API too early. There are some clear mistakes in the stdlib image API, but we can't change them now.

If I'm going to be touching all this stuff anyway then is it worth regularizing the existing library API? (which I realize means "adding new functions" rather than changing the existing ones) I can see that as reasonable, though perhaps premature since image/gif has some additional issues beyond just inconsistencies with Encode's parameter list.

In regards to #5465 is the concern the amount of memory allocated or the number of objects created?
Both? For example, if you look for "sky_with_icc.gif" in the (private) google3 repository, it's a 3MB GIF file for which over 60% of that bulk is an ICC profile. Because the GIF format is... sub-optimal, that 1.8MB profile is cut into over 7000 255-byte chunks. If I want to decode that GIF file in Go, will I be obligated to allocate something (many things?) to hold all of that metadata, even if I don't actually care about it?

Since you can't properly interpret the bits of the image without it, you're kind of obliged to?

Past that, I suppose the underlying question is "should the default image decoder fully decode the image?" The metadata is part of the image, and currently the decoder doesn't touch it. I think it should, but on reflection I think the image libraries should also take the image-affecting metadata into account (including, for example, the gamma and rotation data) when decoding and manipulating the images and that's definitely not currently happening, and making decode properly respect those would definitely be a semantically breaking change.

This does argue for separate metadata-aware encoding and decoding functions.

Re: API iteration options:

  1. Cloned image libraries
  2. Dummy CL with API/doc proposal
  3. World-readable google doc

Since I apparently can't get docs to share beyond google.com (I assume there's a switch, I just don't know what it is), and I'm not comfortable tossing up code on GitHub that'll potentially end up abandoned, I'll go with option 2 here.

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 15, 2019

Interfaces with a single dummy entry for type-enforcement are kinda silly

Maybe. IIRC protobuf messages have those. There might be other precedent. I started a discussion at https://groups.google.com/forum/#!topic/golang-dev/aRvnYIcaIaA

If I'm going to be touching all this stuff anyway then is it worth regularizing the existing library API?

Quite possibly, yes.

  1. Cloned image libraries
  2. Dummy CL with API/doc proposal
  3. World-readable google doc

... I'm not comfortable tossing up code on GitHub that'll potentially end up abandoned, I'll go with option 2 here.

To be clear, I'm perfectly happy to go with option 2 in terms of discussing API. In terms of landing code, I would still strongly prefer to avoid working directly in the stdlib to start with, unless absolutely necessary. I think this will take more than one 6 month cycle to iterate to something great, and as I've said before, I don't want to prematurely commit to an API we'd have to support for a long time.

Thus, I strongly prefer that this work happens in either a fork (e.g. new GitHub repo) or a branch. I just think that a fork would be easier to work with (e.g. you can "go get" a GitHub repo), but I don't have much experience with working on a branch of the main Go repo, and it might be easier than I fear.

I wouldn't think of it as code that will be abandoned. I would think of it as an experimental branch. It's just that, as I said, I think telling someone "go get aSeparateRepo" to try this out would be easier than to have them build Go from source, but from a branch.

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 16, 2019

Instead of github.com/google/drsimage/image/png, another option is to make an x/imagemetadata sub-repo, so the import path for the experimental code would be golang.org/x/imagemetadata/image/png. OTOH, that might be confused with the existing x/image sub-repo.

It's not necessarily a good idea, just an idea.

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 16, 2019

If I'm going to be touching all this stuff anyway then is it worth regularizing the existing library API?

If we're making new API, another possibility (in e.g. package png) is
DecodeImage(io.Reader) (Image, Metadata, error)
instead of somehow augmenting (in a back-compat way)
Decode(io.Reader) (Image, error)

That way, we don't need to add anything to e.g. the image.RGBA types. We could also possibly add an options argument, or make it a method, not a function.

Again, not necessarily a good idea, just an idea.

@bcmills

This comment has been minimized.

Copy link
Member

commented Aug 16, 2019

If the Metadata interface doesn't actually have anything in common between implementations, the DecodeImage alternative above seems best: then it can return a concrete type — with well-documented methods and fields! — instead of an empty abstraction.

package png

type Metadata struct {
	[…]  // Fields specific to PNG metadata!
}

func DecodeImage(io.Reader) (image.Image, *Metadata, error)
@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 16, 2019

If the Metadata interface doesn't actually have anything in common between implementations

Well, it'd be nice if decoding a JPEG and re-encoding to PNG could preserve e.g. EXIF metadata.

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2019

Another thing that'd be nice is if image.Decode, not just png.Decode, returned metadata that we could inspect if it contained EXIF information.

@drswork

This comment has been minimized.

Copy link
Author

commented Aug 17, 2019

The one thing I'd prefer to avoid is having the decoded metadata be disconnected from the associated image. While returning it separately as well can be useful, most of the time the metadata you get from an image is specific to that image and so really ought to be attached to the image. Otherwise it makes keeping the image and its associated metadata together much more of a pain (and, indeed, couldn't be done transparently with existing code) if they're two independent bits of data that have to be passed as a pair.

@bcmills

This comment has been minimized.

Copy link
Member

commented Aug 19, 2019

@drswork, note that you could have the metadata refer to its corresponding image (rather than vice-versa). Then you'd still only have one item to pass around (the metadata), but you'd have proper type information for it (exactly which kind of metadata it is, and possibly for exactly which kind of image as well).

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2019

Oh, if we're minting new image API... please bear in mind #27830 and its links to even older issues.

@drswork

This comment has been minimized.

Copy link
Author

commented Aug 23, 2019

There are some interesting ideas in #27830. I was unaware of it, but now I know and it mirrors some of what I was contemplating so I'll keep that in mind while I'm sketching things out this weekend.

I am half tempted to revisit how image/gif handles animated gifs by default, but I'm not sure if that's something I want to get into.

@nigeltao

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2019

Yeah, I'm also not happy about animated GIFs in hindsight (e.g. requiring the whole animation to be in-memory; being probably too low-level in giving Disposal enum values but not actually implementing them for you). But mission creep (e.g. "define an animation/video API") is also a problem. As the scope grows, it will definitely take more than one 6 month cycle to iterate to something great.

FWIW, I have a long term goal for an "image 2.0" API for Go, but progress is very slow. It's not something I'm working on week to week, or even month to month.

In any case, for "image 2.0", (1) I'd probably make a new git repo instead of augmenting the stdlib, and (2) as I allluded to in #27830, I'd base it on https://github.com/google/wuffs and have Wuffs generate Go code, not just C code. For example, one benchmark measures Wuffs' GIF decoder at 3x faster than Go's image/gif, partly because I don't really use the io.Reader interface directly (look for Transformer in the #27830 discussion).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.