New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Media type rework #1770

Merged
merged 11 commits into from Apr 25, 2018

Conversation

Projects
None yet
4 participants
@cquiroz
Member

cquiroz commented Apr 5, 2018

This is fairly large PR that refactors MediaType making it more compatible with the current style, using an auto-generated list of mime types and including typeclasses for MediaType.

Some notes:

  • This goes to master as it is binary incompatible
  • This finally gets rid of Registry!
  • The list of MediaTypes is now auto-generated from MimeDB. The file MimeDB.scala contains that list on a form amenable to http4s. This fixes #73
  • MimeDB.scala is generated with a utility at https://github.com/cquiroz/mime-http4s-generator. This should likely be transferred to the http4s organization.
  • Note that I'm checking in MimeDB.scala. I thought about making it part of the build but given mime-http4s-generator use http4s-client and circe leading to conflicts. Also, I think it's better to have a stable checked in version rather than risking the build to change when the db is modified.
  • I've reduced the amount of MimeTypes included in the object MimeType to the minimal to allow compilation. This may break some user code
  • MediaRange and MediaType are no longer Renderable
  • Note that users can't add MediaTypes any longer though you can manually create anything you need. This only could affect serving files for unknown file types
}
implicit val http4sInstancesForMediaRange
: Show[MediaRange] with HttpCodec[MediaRange] with Order[MediaRange] =
new Show[MediaRange] with HttpCodec[MediaRange] with Order[MediaRange] {
override def show(s: MediaRange): String = s.toString
override def show(s: MediaRange): String =
s"${s.mainType}/*${MediaRange.extensionsToString(s)}"

This comment has been minimized.

@cquiroz

cquiroz Apr 5, 2018

Member

I've made show match whats produced by renderrather than matchtoString. this made it easier to remove Renderable` from the hierarchy

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

Interesting. This is something we should be consistent on throughout the model: do we want show to match the wire representation and be part of the HttpCodec laws?

override def toString: String = "MediaType(" + renderString + ')'
}
object MediaType extends Registry {

This comment has been minimized.

@cquiroz

cquiroz Apr 5, 2018

Member

Bye, bye Registry. But also registerFileExtension is gone

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

It won't be missed.

app("msword", uncompressible, binary, "doc", "dot", "w6w", "wiz", "word", "wri")
val `application/octet-stream` = app(
/////////////////////////// PREDEFINED MEDIA-TYPE DEFINITION ////////////////////////////
// Copied from the definitions on MimeDB

This comment has been minimized.

@cquiroz

cquiroz Apr 5, 2018

Member

Predefined MediaTypes are copied verbatim from MimeDB

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

I don't understand what's here vs. in MimeDB.

This comment has been minimized.

@cquiroz

cquiroz Apr 5, 2018

Member

I struggled with this a bit. We had several MediaTypes pinned here and now those are in MimeDB. There are a few alternatives:

  1. We remove these altogether and make callers do e.g. MimeDB.text.text/plain
  2. In MediaType we create types like text/plain and link them to MimeDB.text.text/plain
  3. Or we copy the definitions from MimeDB to here (The option used)

Ideally, we'd use one of the first options but it gets a bit complex with application. There are a few thousand application mime types, and given the JVM code size limits it needs to be split on the code generation part. e.g. application/xml is MimeDB.application.application_2.application/xml. Even worse the numbering is not fixed as the mimedb can change so an update of MimeDB could break code. In option 1 that's not acceptable as we'd require users to change theyr code. option 2 maybe better as we'd need to update the links when MimeDB changes anyway.

I'm not heavily invested in option 3. we could switch to option 2 if you think is better

@@ -0,0 +1,6700 @@
package org.http4s
private[http4s] object MimeDB {

This comment has been minimized.

@cquiroz

cquiroz Apr 5, 2018

Member

MimeDB is checked in rather than auto generated during the build. This means builds are stable but we could be stale with respect to updates on the upstream db.
There are pros/cons about the alternatives:

Generated on build:

  • Pros:
  1. Always up to date
  2. We don't need to check in this large file
  • Cons:
  1. Difficult to make the generator using http4s/circe without producing conflicts on the build
  2. The build could break due to upstream changes
  3. Cannot build it offline (Though in my experiments the downloaded file was cached)

Checked in:

  • Pros:
  1. Stable builds and we can safely refer to MimeDB
  2. This large file would totally kill a Scala.js version. If we check this in we can have a MimeDB for scala.js which is essentially empty
  • Cons:
  1. MimeDB could get stale
  2. Eventually somebody could edit this

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

I dislike checking in generated code, but I'm a big fan of repeatable builds. A compromise is a generator pinned to a git hash. But that complicates the build, and makes keeping it up to date a chore, so maybe it's the worst of both worlds.

I agree with your pros and cons and don't have a strong opinion. Would love to hear from others here.

This comment has been minimized.

@jmcardon

jmcardon Apr 5, 2018

Member

is this every build, or triggered by a command?

This comment has been minimized.

@cquiroz

cquiroz Apr 5, 2018

Member

At the moment it is just created with a separate utility and manually added to Http4s repo and formatted

}
object application {
lazy val all
: List[MediaType] = Nil ::: application.all ::: application_1.all ::: application_2.all

This comment has been minimized.

@cquiroz

cquiroz Apr 5, 2018

Member

This is slightly convoluted. Given that there are several thousand application/ mime types the classes get too large and we hit the JVM 64kb limit. Splitting them this way lets us workaround that error

@rossabaker

This is neat. I like the gist of it.

}
implicit val http4sInstancesForMediaRange
: Show[MediaRange] with HttpCodec[MediaRange] with Order[MediaRange] =
new Show[MediaRange] with HttpCodec[MediaRange] with Order[MediaRange] {
override def show(s: MediaRange): String = s.toString
override def show(s: MediaRange): String =
s"${s.mainType}/*${MediaRange.extensionsToString(s)}"

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

Interesting. This is something we should be consistent on throughout the model: do we want show to match the wire representation and be part of the HttpCodec laws?

override def toString: String = "MediaType(" + renderString + ')'
}
object MediaType extends Registry {

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

It won't be missed.

app("msword", uncompressible, binary, "doc", "dot", "w6w", "wiz", "word", "wri")
val `application/octet-stream` = app(
/////////////////////////// PREDEFINED MEDIA-TYPE DEFINITION ////////////////////////////
// Copied from the definitions on MimeDB

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

I don't understand what's here vs. in MimeDB.

@@ -0,0 +1,6700 @@
package org.http4s
private[http4s] object MimeDB {

This comment has been minimized.

@rossabaker

rossabaker Apr 5, 2018

Member

I dislike checking in generated code, but I'm a big fan of repeatable builds. A compromise is a generator pinned to a git hash. But that complicates the build, and makes keeping it up to date a chore, so maybe it's the worst of both worlds.

I agree with your pros and cons and don't have a strong opinion. Would love to hear from others here.

@aeons

This comment has been minimized.

Member

aeons commented Apr 6, 2018

How often does MimeDB change?

In general I agree with @rossabaker wrt checking in generated code, but for something like this I can definitely see the point in not rebuilding it on every build.

So +1 from me on having an "offline" build and checking in the generated code.

@rossabaker

This comment has been minimized.

Member

rossabaker commented Apr 6, 2018

I count six non-build changes so far this year in the upstream repo.

@cquiroz

This comment has been minimized.

Member

cquiroz commented Apr 6, 2018

That's not much and I'd think only a few new items appear. for the most part MIME types are well known.
I think I want to move the generation tool to http4s as a project and that would let you easily rebuild it. I don't know how easy it is but it should be just another project

private def uncompressible = false
private def binary = true
private def notBinary = false
lazy val `application/javascript`: MediaType = MimeDB.application.`application/javascript`

This comment has been minimized.

@cquiroz

cquiroz Apr 7, 2018

Member

With the improved structure of MimeDB I feel ok about linking directly here

@@ -6,6 +6,30 @@ import org.scalacheck.Prop
import scala.util.Success
package object testing {
// Media types used for testing
// Copied from the definitions on MimeDB
val `text/css`: MediaType = MimeDB.text.`text/css`

This comment has been minimized.

@cquiroz

cquiroz Apr 7, 2018

Member

Do you think this is a good place to put these definitions?

// Autogenerated file, don't edit
package org.http4s
private[http4s] object MimeDB {

This comment has been minimized.

@cquiroz

cquiroz Apr 7, 2018

Member

Note I've made MimeDB private. It is the best for MiMa but could make like harder for clients that want to refer to the mime types. WDYT?

This comment has been minimized.

@rossabaker

rossabaker Apr 11, 2018

Member

I guess I'm still finding it a little bit weird that these aren't just mixed into the MediaType companion. What's the disadvantage to that?

This comment has been minimized.

@cquiroz

cquiroz Apr 11, 2018

Member

do you mean object MediaType extends MimeDB?
That would be possible. I'll check the changes needed for that

This comment has been minimized.

@rossabaker

rossabaker Apr 11, 2018

Member

Yes, that's exactly what I mean.

Maybe this is a bad idea, because it will lead to thousands of members in a common type. It might be miserable for people doing autocompletes. But it's where I'd look for them.

@cquiroz cquiroz force-pushed the cquiroz:media-type-immutable branch from 5c30ac3 to 4e5f9e6 Apr 11, 2018

@cquiroz

Latest changes

val updated = current.updated(lcExt, mediaType)
if (!extensionMap.compareAndSet(current, updated)) registerFileExtension(ext, mediaType)
}
object MediaType extends MimeDB {

This comment has been minimized.

@cquiroz

cquiroz Apr 11, 2018

Member

Now MediaType extends MimeDB, thus allowing to write things like MediaType.text.text/plain

super.register(key, mediaType)
mediaType.fileExtensions.foreach(registerFileExtension(_, mediaType))
mediaType
def multipartType(subType: String, boundary: Option[String] = None): MediaType = {

This comment has been minimized.

@cquiroz

cquiroz Apr 11, 2018

Member

I had to rename this to multipartType to avoid a collision with MimeDB.multipart

}
def forExtension(ext: String): Option[MediaType] = extensionMap.get.get(ext.toLowerCase)
// Curiously text/event-stream isn't included in MimeDB
lazy val `text/event-stream` = new MediaType("text", "event-stream")

This comment has been minimized.

@cquiroz

cquiroz Apr 11, 2018

Member

This is a sore point. Should I make the generator create this manually?

@cquiroz cquiroz force-pushed the cquiroz:media-type-immutable branch from 4e5f9e6 to 58dbaca Apr 16, 2018

@rossabaker

This comment has been minimized.

Member

rossabaker commented Apr 16, 2018

I hate to be a pain, but do we want:

MediaType.audio.`audio/ogg`

or

MediaType.audio.ogg

I think I like the latter.

@cquiroz

This comment has been minimized.

Member

cquiroz commented Apr 21, 2018

How could we move this forward?

@rossabaker

This comment has been minimized.

Member

rossabaker commented Apr 22, 2018

I think it's ready, but ties into #1797: this is a big change, and we might want to defer big changes until 1.0, but master is a mix of small changes and big changes.

Where can we put this and not drive ourselves crazy maintaining three branches?

@cquiroz

This comment has been minimized.

Member

cquiroz commented Apr 22, 2018

Sounds good, let’s hold until the right time is due

@cquiroz cquiroz referenced this pull request Apr 24, 2018

Open

WIP Don't merge. Scala.js compilation #1801

4 of 5 tasks complete
@rossabaker

This comment has been minimized.

Member

rossabaker commented Apr 25, 2018

We decided to press forward with master. Merging.

@rossabaker rossabaker merged commit 0c42002 into http4s:master Apr 25, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@cquiroz cquiroz deleted the cquiroz:media-type-immutable branch Apr 25, 2018

@cquiroz

This comment has been minimized.

Member

cquiroz commented Apr 25, 2018

Great! I'll add the generator is another PR to let anybody update MimeDB

cquiroz added a commit to cquiroz/http4s that referenced this pull request Apr 25, 2018

@cquiroz cquiroz referenced this pull request Apr 26, 2018

Merged

MimeDB generator #1809

cquiroz added a commit to cquiroz/http4s that referenced this pull request Apr 26, 2018

cquiroz added a commit to cquiroz/http4s that referenced this pull request May 3, 2018

rossabaker added a commit that referenced this pull request May 30, 2018

cquiroz added a commit to cquiroz/http4s that referenced this pull request Jul 17, 2018

cquiroz added a commit to cquiroz/http4s that referenced this pull request Aug 10, 2018

cquiroz added a commit to cquiroz/http4s that referenced this pull request Nov 5, 2018

cquiroz added a commit to cquiroz/http4s that referenced this pull request Nov 27, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment