Proposal for Cross-plat Server-side Image Manipulation Library #2020

Closed
kendrahavens opened this Issue Jun 12, 2015 · 40 comments

kendrahavens commented Jun 12, 2015

Proposal for Cross-plat Server-side Image Manipulation Library

The .NET Team and many developers in the community would like a graphics API for .NET Core and so we are starting to work on this. Of course, "graphics API" is a very broad term. To narrow down the scope we are looking at these main needs:

  1. Many developers want to target .NET Core, but rely on Framework APIs like System.Drawing. We'd like to make this easier.
  2. .NET doesn't have any appropriate image manipulation APIs for server-side apps. We want to improve server-side image processing.
  3. We'd like to enable .NET developers to reach multiple platforms with apps that manipulate images.

To address these needs, we plan to start experimenting with a cross-plat server-side image manipulation library.

Goals:

  1. Fill the gap of image manipulation in .NET Core
  2. Focus on server and cloud scenarios
  3. Support Linux, Mac and Windows
  4. Support image resizing and basic drawing operations
  5. Provide a performant solution
  6. Provide a well-designed API that can also support client scenarios in the future

In the next few weeks we will have some .NET summer interns start prototyping parts of this library so we are excited to get started!

Options for Cross-plat implementation:

  1. .NET wrapper around a native cross-platform implementation
    • Pre-existing libraries make providing a wrapper more feasible and have known behavior
    • We don't need to own the implementation
  2. .NET wrapper around APIs provided by the operating systems
    • Has many of the same benefits of the first option
    • Not all versions of all operating systems will likely have the functionality required
    • Likely to result in different behaviors across different operating systems
  3. Fully Managed Implementation
    • Not practical as it requires us to write something from scratch

Option 1 is currently our preferred option. There are many libraries for cross-plat server-side image manipulation that we have been investigating. OpenGL based libraries and libGD both looked promising. For this project we are starting to think libGD is the best place to start.

We have already had some great feedback from the community on primitive drawing types which we expect to use for this library. Now, we are ready for feedback on the image manipulation library. The first scenarios we are looking into are thumbnailing and watermarking. Please provide feedback on the goals, options for implementation, native image manipulation libraries we could use, and scenarios that we should address.

daniel-kun commented Jun 12, 2015

How about using Anti-Grain as a backend? That might result in less "sloppy" drawing than WPF does :-)

Contributor

StephenCleary commented Jun 12, 2015

I think this is a great idea! Instead of "server-side", though, I'd say "non-UI-affine" or something like that. It would be awesome if this same API was available to non-UI threads on client apps.

malekpour commented Jun 12, 2015

I hope this will become a completely separate project. Developers will pick that or any other alternative based on their preferences.

Member

akoeplinger commented Jun 12, 2015

I think @nathanaeljones has some ideas for this :)

Member

dsplaisted commented Jun 12, 2015

@StephenCleary wrote:

I think this is a great idea! Instead of "server-side", though, I'd say "non-UI-affine" or something like that. It would be awesome if this same API was available to non-UI threads on client apps.

"Server-side" helps define what we're initially focusing on, both in scenarios and implementation. Unless we run into something unexpected, it should also work on or off the UI thread in client apps. Also note that goal 6 is that we design something that can also support client scenarios in the future.

kendrahavens commented Jun 12, 2015

@akoeplinger We are definitely keeping Nathanael's input in mind. :)

lilith commented Jun 12, 2015

Direct link to specific notes about building this kind of library.

LibGD is certainly the closest fit, but it needs lots of work. I do not believe it is possible to wrap LibGD 2.1 as-is; you would have to fork LibGD into a completely incompatible API in order to implement error handling (as well as to fix some hard design flaws).

Once you've done that, you've essentially committed to maintaining your own library. I know that the maintainers plan for LibGD 3 to be much closer to our needs, but nobody is funding work in this direction, and it's a helluva lot of work.

The path of least resistance is to create a very focused C library that only implements operations that are very fast and have predictable performance. Your managed wrapper (and LibGD) then consume this new API for all core processing needs. All vector drawing, font parsing/rendering should be segmented as a plugin that wraps Cairo. Png, jpeg, and gif support should be built in, but every other format should be a plugin. This way users can opt-in to high-risk features like TIFF parsing and font rendering, which have a terrible security record on every platform.
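One way to picture the opt-in model described above is a small codec registry: the three built-in formats are enabled from the start, and anything else has to be switched on explicitly. This is only an illustrative sketch. `CodecRegistry`, `enable_plugin`, and `sniff` are hypothetical names, not part of LibGD or of any proposed .NET API; only the file signatures are standard.

```python
# Hypothetical sketch of an opt-in codec registry; all names are illustrative.
BUILT_IN_SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}

# Formats that stay disabled until the application opts in.
PLUGIN_SIGNATURES = {
    "tiff": (b"II*\x00", b"MM\x00*"),
}


class CodecRegistry:
    def __init__(self):
        # Start with only the built-in formats enabled.
        self._enabled = dict(BUILT_IN_SIGNATURES)

    def enable_plugin(self, name):
        """Explicitly opt in to a higher-risk format."""
        self._enabled[name] = PLUGIN_SIGNATURES[name]

    def sniff(self, header):
        """Return the enabled format whose signature matches, else None."""
        for name, signatures in self._enabled.items():
            if any(header.startswith(sig) for sig in signatures):
                return name
        return None


registry = CodecRegistry()
tiff_header = b"II*\x00" + b"\x00" * 4
print(registry.sniff(b"\x89PNG\r\n\x1a\n"))  # built-in format is recognized
print(registry.sniff(tiff_header))           # TIFF is not enabled yet
registry.enable_plugin("tiff")
print(registry.sniff(tiff_header))           # recognized after opting in
```

With this shape, an application that never calls `enable_plugin` never hands untrusted bytes to the riskier parsers.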

Platform APIs aren't even consistent enough between versions of Windows to be considered here. Also, everything is broken in this context, and it can't be fixed. This library has to be app-local versioned all the way down, or you're creating ultimate misery and security nightmares.

Naive C++ implementations are typically orders of magnitude too slow for real-time. I think a 100% managed code implementation is unrealistic. And if it doesn't need to be real-time, stick it in a queue for a linux box to deal with. ImageMagick is orders of magnitude slower than my algorithms, but it will get the job done fine in a queue. And libvips (despite no transparency support) is quite fast - fast enough that it can beat any managed imaging even if you include I/O, queue time, datacenter-local networking, etc.

Basically, this C library needs to be good, really good, or it won't have any reason for existence outside of the .NET community, and that would bode ill for its future health.

jbattermann commented Jun 12, 2015

To add some bird's-eye-level feedback: in addition to a default set of .NET Core-supported image types and manipulations, it would be great to be able to write and plug in custom ones for both, i.e. codecs that support more image/media types and additional manipulations/transformations, and even to replace or override the default .NET Core ones if needed.

Ideally these .NET Core interfaces should also be async/awaitable, so that transformations can be offloaded to a local GPU, to external processes, or even to remote systems/services (for example, heavy-lifting medical image reconstruction/transformation on remote server clusters). This ought to be transparent to the local .NET Core application, be it a server or a desktop one.

lilith commented Jun 12, 2015

We should separate our high-level API needs from our low-level primitive needs.

At a high level, users will want (or end up creating) both declarative (result-descriptive) and imperative (ordered operation) APIs. People reason about images in a lot of different ways, and if the tool doesn't match their existing mental pattern, they'll create one that does.

At a mid-to-high level, I'd love to see a generic graph-based representation of an image processing workflow. Visual details will change depending on the backend (no two imaging libraries produce the same results), but being able to mix and match libvips, imagemagick, libgd & managed code would be nice.

Cons: hard to reason about, complex to work with directly. Multi-dimensional images (TIFF, GIF) add even more trouble.

Pros: Easily wrapped as a declarative API or as an imperative API. Can apply advanced optimizations and pick the fastest or best backend depending upon image format/resolution and desired workflow. Given how easily most operations compose, this could easily make the average workflow 3-8x faster.

As much as I would love to see this, a graph API should be the last part designed, as it requires detailed knowledge of all the backends it can support.
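As a rough illustration of why a declarative representation leaves room for optimization, the sketch below treats a workflow as a plain list of operation tuples and merges adjacent operations before any pixels are touched. The operation names and the `optimize` function are hypothetical, and a real graph API would need far more than this.

```python
# Hypothetical sketch: a workflow described as data and simplified up front.
def optimize(ops):
    """Merge adjacent rotations and crops; drop rotations that cancel out."""
    result = []
    for op in ops:
        if result and op[0] == "rotate" and result[-1][0] == "rotate":
            # Two rotations in a row collapse into one (mod 360 degrees).
            angle = (result.pop()[1] + op[1]) % 360
            if angle:
                result.append(("rotate", angle))
        elif result and op[0] == "crop" and result[-1][0] == "crop":
            # A crop of a crop is one crop with combined offsets
            # (assumes the inner rectangle fits inside the outer one).
            _, x0, y0, _, _ = result.pop()
            _, x1, y1, w1, h1 = op
            result.append(("crop", x0 + x1, y0 + y1, w1, h1))
        else:
            result.append(op)
    return result


workflow = [
    ("crop", 10, 10, 200, 100),   # (name, x, y, width, height)
    ("crop", 5, 5, 50, 50),
    ("rotate", 90),               # (name, degrees)
    ("rotate", 270),
    ("scale", 0.5),
]
print(optimize(workflow))
```

Here five requested steps reduce to one crop and one scale, which is the kind of saving a backend can only make when it sees the whole workflow rather than one call at a time.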

From a practical standpoint, it's best to start with the low-level operations, and expose reusable APIs that others can build on top of. We don't want to chase data structure genericity at a low level. For example, if you expose an interface that supports multiple color spaces or bit depths, you implicitly force APIs to support all of those permutations, many of which will make no or little sense. Most compiler optimizations for inner loops only happen when the channel byte count is known ahead of time; this matters more than you'd think.

Key low-level primitives

Color adjustments

  • Convert from arbitrary color space and profile to sRGB
  • sRGB<->Linear functions, on scanline sets at a time (Operations that do any blending of pixels need to operate in linear).
  • Apply gamma/adjust channels independently
  • Color adjustment matrix application
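To make the sRGB<->Linear item concrete, here is a minimal sketch of the standard sRGB transfer functions for a single channel, plus a blend that converts to linear light first. The function names are illustrative; a real implementation would work on whole scanlines, most likely with lookup tables.

```python
def srgb_to_linear(c):
    """Map an sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l):
    """Map a linear-light value in [0, 1] back to sRGB encoding."""
    if l <= 0.0031308:
        return l * 12.92
    return 1.055 * l ** (1 / 2.4) - 0.055


def blend_linear(a, b, t=0.5):
    """Blend two 8-bit sRGB channel values in linear light."""
    la = srgb_to_linear(a / 255)
    lb = srgb_to_linear(b / 255)
    mixed = la * (1 - t) + lb * t
    return round(linear_to_srgb(mixed) * 255)


naive = (0 + 255) // 2          # averaging the encoded values directly
correct = blend_linear(0, 255)  # averaging in linear light
print(naive, correct)           # the linear blend comes out visibly brighter
```

The gap between the two printed values is why scaling or blurring directly on encoded sRGB values tends to darken fine detail.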

Image analysis

  • Calculate histogram
  • Auto white balance
  • Fast octree quantizer
  • Face detection (cropping heads off selfies to meet an aspect ratio need is uncool). <- tricky to do a compact implementation, data set is heavy.
  • Document type detection (photograph, document, line art, etc).
  • Detect boundaries (Sobel filter, edges inward - can be applied locally with tiny allocation requirements)
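For the boundary-detection item, a minimal sketch of a Sobel gradient-magnitude pass over a grayscale grid might look like the following. It uses the standard 3x3 Sobel kernels, but it allocates a full output image for simplicity, so it does not demonstrate the "tiny allocation" variant mentioned above.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(gray):
    """Return gradient magnitudes for the interior pixels of a 2D grid.

    Border pixels are left at 0 because the kernel does not fit there.
    """
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for ky in range(3):
                for kx in range(3):
                    value = gray[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * value
                    gy += SOBEL_Y[ky][kx] * value
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# A dark region on the left meeting a bright region on the right.
image = [[0, 0, 0, 255, 255] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1])  # the response is zero in flat areas and large at the edge
```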

Operations requiring matrix transposition (which we avoid at all costs)

  • Mathematically correct interpolation, with custom interpolation weighting callback. Cached weights mean this callback will be invoked only (w x h x scale factor) number of times.
  • Generic convolution kernel applicator, with and without thresholds. Size matters; large kernels will drastically affect performance, and this needs to be clearly documented.
  • Rotate 90 degree intervals
  • Performance constant blur (3x box blur approximates gaussian)
  • Performance constant sharpen

Scale, convolve, rotate 90 degrees, blur, and sharpen - can be composed and require a single transposition. Separately they would require 7.
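The "performance constant blur" item can be sketched with a running-sum box blur on a single scanline: each output sample costs one addition and one subtraction whatever the radius, and repeating the pass three times approximates a Gaussian. The names below are illustrative, and clamping at the edges is just one of several possible border policies.

```python
def box_blur_1d(values, radius):
    """Box-blur one scanline using a running sum.

    The per-sample cost does not grow with the radius.
    """
    n = len(values)
    window = 2 * radius + 1

    def at(i):
        # Clamp reads past either end to the nearest edge sample.
        return values[min(max(i, 0), n - 1)]

    total = sum(at(i) for i in range(-radius, radius + 1))
    out = []
    for i in range(n):
        out.append(total / window)
        # Slide the window: add the sample entering, drop the one leaving.
        total += at(i + radius + 1) - at(i - radius)
    return out


def approx_gaussian_1d(values, radius, passes=3):
    """Approximate a Gaussian blur with repeated box blurs."""
    for _ in range(passes):
        values = box_blur_1d(values, radius)
    return values


impulse = [0, 0, 0, 9, 0, 0, 0]
print(box_blur_1d(impulse, 1))         # one pass: a flat box
print(approx_gaussian_1d(impulse, 1))  # three passes: a bell-like shape
```

A 2D blur would run the same pass over rows and then over columns, and the column pass is where the transposition cost discussed above comes from.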

Trivial operations

  • Flip
  • Crop (doesn't require a copy, just stride & pointer adjustment)
  • Create canvas
  • Fill rectangle with solid color
  • Copy (overwrite)
  • Compositing (ok, not trivial, but easily managed if you lock down color spaces and alpha premultiplication).
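The note that cropping needs only a stride and pointer adjustment can be sketched as a view onto a shared buffer. `ImageView` is a hypothetical type written for this illustration, and it assumes one byte per pixel to keep the arithmetic visible.

```python
class ImageView:
    """A window onto a shared pixel buffer; cropping never copies pixels."""

    def __init__(self, buffer, width, height, stride, offset=0):
        self.buffer = buffer    # flat, row-major, one byte per pixel here
        self.width = width
        self.height = height
        self.stride = stride    # bytes between the starts of adjacent rows
        self.offset = offset    # index of this view's top-left pixel

    def crop(self, x, y, width, height):
        if x < 0 or y < 0 or x + width > self.width or y + height > self.height:
            raise ValueError("crop rectangle falls outside the image")
        # Same buffer and stride; only the start offset and size change.
        return ImageView(self.buffer, width, height, self.stride,
                         self.offset + y * self.stride + x)

    def get(self, x, y):
        return self.buffer[self.offset + y * self.stride + x]

    def set(self, x, y, value):
        self.buffer[self.offset + y * self.stride + x] = value


pixels = bytearray(range(16))            # a 4x4 single-channel image
full = ImageView(pixels, 4, 4, stride=4)
inner = full.crop(1, 1, 2, 2)
inner.set(0, 0, 255)                     # writes through to the shared buffer
print(full.get(1, 1))
```

Because the cropped view keeps the parent's stride, every other primitive has to honor stride rather than assume rows are packed back to back.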

You'll note that affine transform/distort is notably absent. Distortion has exponentially bad performance with image size - it's not linear. Large convolution kernels have a similar effect. Distortion is rarely needed and use should be minimized. Cairo's implementation is fine.

On top of these primitives, and combined with existing codecs, we could build a respectable image library.

musukvl commented Jun 13, 2015

Please add EXIF support, because it is hard to rotate an image without affecting the EXIF Orientation tag. A resized image should also keep all EXIF tags from the source image.
Currently we have no cross-platform tool for working with Exif or Exif2, but the ability to easily get/set EXIF tags would be very useful in this century.
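To show what handling the Orientation tag involves, the sketch below maps each of the eight EXIF orientation values to the transform that makes the stored pixels upright, then resets the tag to 1. It works on a plain list of rows and does not read or write real EXIF data. The helper names are illustrative, not an existing API.

```python
def flip_horizontal(rows):
    return [row[::-1] for row in rows]


def flip_vertical(rows):
    return rows[::-1]


def transpose(rows):
    return [list(col) for col in zip(*rows)]


def rotate_90_cw(rows):
    return [list(col) for col in zip(*rows[::-1])]


def rotate_90_ccw(rows):
    return transpose(rows)[::-1]


def rotate_180(rows):
    return [row[::-1] for row in rows[::-1]]


# EXIF Orientation tag value -> transform that makes the pixels upright.
UPRIGHT_TRANSFORMS = {
    1: lambda rows: rows,
    2: flip_horizontal,
    3: rotate_180,
    4: flip_vertical,
    5: transpose,
    6: rotate_90_cw,
    7: lambda rows: transpose(rotate_180(rows)),  # flip across the anti-diagonal
    8: rotate_90_ccw,
}


def normalize_orientation(rows, orientation):
    """Return (upright pixels, new tag value); the tag is reset to 1."""
    return UPRIGHT_TRANSFORMS[orientation](rows), 1


stored = [[1, 2, 3],
          [4, 5, 6]]
upright, tag = normalize_orientation(stored, 6)
print(upright, tag)
```

A library that rotates pixels without resetting the tag (or resets the tag without rotating) produces images that viewers display rotated twice or not at all, which is the failure mode the comment describes.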

AlfonsoML commented Jun 13, 2015

One of the main goals of a server is to serve the existing images, so being able to use optimized images will improve both storage and bandwidth usage.

So it would be great to have some low level methods to work on images and provide a guide or higher level methods to quickly optimize them.

The optimizations will depend on the type of image (lossless or lossy, e.g. png and jpg) as well as on whether the optimizations must be done in a lossless way (removing metadata) or some data loss is allowed.

So the first step is to remove metadata from an image. Adobe is known to include huge amounts of data inside images, and many images might include thumbnails that aren't used. Stripping that away is step 0.

Integrate code from projects like jpegtran, jpegoptim, OptiPng, pngcrunch ... or develop similar code to reach those goals (specific ways to optimize jpg and png)

For Png, recompress the chunks using Zopfli

Being able to use other encoders like mozjpeg

Convert jpgs to subsampling 4:2:0

Recompress based on visual quality instead of a meaningless "quality" percentage: http://calendar.perfplanet.com/2014/little-rgb-riding-hood-a-jpegs-tale/
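As a small illustration of the 4:2:0 suggestion in the list above: the luma plane keeps full resolution while each chroma plane is halved in both directions. The sketch below averages 2x2 blocks of one chroma plane. Real encoders may use other filters and sample positions, so treat this as the general idea rather than what any particular JPEG encoder does.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumes even width and height to keep the sketch short.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            block = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append((block + 2) // 4)  # rounded integer average
        out.append(row)
    return out


cb = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
print(subsample_420(cb))
```

For a 4x4 image this leaves 16 luma samples plus 4 samples per chroma plane, 24 values instead of 48 before entropy coding even starts.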

Some of these operations are far more CPU-expensive than others, so they can't be blindly applied everywhere. Still, in the long term an optimized image will be much better than a bloated one, and being able to optimize an uploaded image would be a great bonus for all .NET developers. The good thing is that the different optimizations can be built little by little: start by knowing the final goals, then work on each part, trying to be at least as good as the existing tools that can't be used in a managed environment.

JimBobSquarePants commented Jun 14, 2015

In the next few weeks we will have some .NET summer interns start prototyping parts of this library so we are excited to get started!

Given the importance and technical difficulty involved in creating such a library I worry that writing it has been deemed such a low priority that it has been assigned to summer interns. Image processing done correctly is difficult and will require a wealth of experience and expertise.

lilith commented Jun 14, 2015

In the next few weeks we will have some .NET summer interns start prototyping parts of this library so we are excited to get started!

I did not catch that on the first read-through. Perhaps you should consult with other imaging experts within Microsoft whom you trust (perhaps in Research or WIC) to get their opinion on whether this is the best approach? I personally find it far easier to create or work on compilers than to create correct and performant image processing algorithms; the former is easier to mathematically represent, has fewer variables, and requires less computer science knowledge.

Resources for the interns

There are not many great textbooks on the subject. Here are some from my personal bookshelf. Between them (and Wikipedia) I was able to put together about 60% of the knowledge I needed; the rest I found by reading the source code to many popular image processing libraries.

I would start by reading Principles of Digital Image Processing: Core Algorithms front-to-back, then Digital Image Warping. Wikipedia is also good, although the relevant pages are not linked or categorized together - use specific search terms, like "bilinear interpolation" and "Lab color space".

The Graphics Gems series is great for optimization inspiration:

I'm not aware of any implementations of (say, resampling) that are completely correct. Very recent editions of ImageMagick are very close, though (We got AppHarbor builds going, BTW!). Most offer a wide selection of 'filters', but fail to scale the input or output appropriately, and the error there is greater than the difference between the filters.

Source code to read

I have found the source code for OpenCV, LibGD, FreeImage, Libvips, Pixman, Cairo, ImageMagick, stb_image, Skia, and FrameWave is very useful for understanding real-world implementations and considerations. Most textbooks assume an infinite plane, ignore off-by-one errors, floating-point limitations, color space accuracy, and operational symmetry within a bounded region. I cannot recommend any textbook as an accurate reference, only as a conceptual starting point.

Also, keep in mind that computer vision is very different from image creation. In computer vision, resampling accuracy matters very little, for example. But in image creation, you are serving images to photographers, people with far keener visual perception than the average developer. The images produced will be rendered side-by-side with other CSS and images, and the least significant bit of inaccuracy is quite visible. You are competing with Lightroom; with offline tools that produce visually perfect results. End-user software will be discarded if photographers feel it is corrupting their work.

And, as always, I suggest that it is negligent to start a new project until you have completely read all issues and bug reports filed against similar existing projects. Everything about this space looks deceptively simple, when in fact it is intractably complex and success depends on making the correct compromises on day 1.

@dsplaisted

dsplaisted Jun 14, 2015

Member

@nathanaeljones By the way, we really appreciate the feedback you've been giving. When we just gave a hint that we were going to look at doing an image manipulation library, you responded with detailed, useful feedback, and you haven't stopped since. :-)

@lilith

lilith Jun 15, 2015

> We're not expecting the interns to implement an entire graphics library. The scope of their summer project is still flexible, but will probably involve wrapping an existing native library with a .NET API.

There is an unfortunate lack of overlap between permissively-licensed libraries and libraries that can be wrapped as-is. I completed a low-level generated wrapper and part of a prototype high-level wrapper for LibGD, so I know the scope of work here.

> We want interns to have a great experience at Microsoft, so we try to give them projects that are interesting, that they can be successful with, and that they will feel like they have made an impact.

I can't think of a more cruel project to give them - this is one of those projects where you "minimize scope of failure", not "succeed at".

> @nathanaeljones By the way, we really appreciate the feedback you've been giving. When we just gave a hint that we were going to look at doing an image manipulation library, you responded with detailed, useful feedback, and you haven't stopped since. :-)

There are few things worse than being assigned to a project where management refuses to acknowledge the scope of the problem you're forced to solve. Since I am strongly against abusing interns, I'm going to keep writing until there is a crystal clear understanding of the challenges involved, and plenty of documentation they can reference in their final report.

I'm focusing on the native side of things, since that is the difficult part. The outer managed API is comparatively trivial.

  • Most existing libraries are designed for one-shot use - like in a script, or with PHP. Memory is not freed after use, or at least, not consistently.
  • Few existing libraries handle allocation failure without segfaulting.
  • Most existing libraries lack any kind of test suite. Wrapping untested C code is not going to end well; access violations are miserable to track down from a managed process.
  • Most existing libraries are trivial to exploit, and have dozens of easily found vulnerabilities. Run valgrind on the library's test suite before you start.
  • Few existing libraries allow custom allocators, yet OS-specific allocation calls are needed to disable paging. GDI+/System.Drawing did this by default, but existing libraries do not. (Paging a bitmap to disk is death for a server. Transposing a 64-million element matrix on disk will completely swamp I/O. Existing back-pressure systems fail here.)
  • Most existing libraries are not re-entrant.
  • Most existing libraries kill the process if any kind of failure occurs.
  • Many existing libraries use static callbacks, which crashes .NET Full processes (cross-domain delegates).
  • Most existing libraries are terribly inefficient; even 10x slower than System.Drawing. (We're talking 1500ms+, unacceptable latency for an HTTP request).
  • Most existing libraries assume all data is trusted. But a server-side program must assume all input is malicious, since it can usually be anonymously submitted.
  • Most existing libraries have completely mismanaged their copyright and licensing, either by changing licenses without contributor sign-off, or by using a custom license that is incompatible with modern prevalent licenses like Apache 2.0 and GPL 3.0.
  • Most existing libraries ignore color management completely.
  • Many existing libraries depend on codecs that have unfixed CVEs.
  • Very few libraries expose error detail in a way that can be accurately wrapped in .NET.
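
The "assume all input is malicious" point in the list above can be illustrated with a small sketch that inspects an image header before any pixel buffer is allocated. This is only an illustration, not taken from any library mentioned here: the function name `checked_png_size` and the `MAX_PIXELS` budget are hypothetical, and only the fixed PNG signature and IHDR layout are relied upon.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_PIXELS = 64_000_000  # hypothetical budget; tune per deployment


def checked_png_size(data: bytes, max_pixels: int = MAX_PIXELS) -> Tuple[int, int]:
    """Return (width, height) from a PNG header, or raise ValueError.

    Nothing is decoded or allocated here; the goal is to reject hostile
    dimensions before a decoder ever sees the file.
    """
    # 8-byte signature + 4-byte chunk length + 4-byte chunk type + width + height
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("malformed IHDR chunk")
    width, height = struct.unpack(">II", data[16:24])
    if width == 0 or height == 0:
        raise ValueError("zero-sized image")
    if width * height > max_pixels:
        raise ValueError("image exceeds pixel budget")
    return width, height
```

A server would run a check like this on every upload and only then hand the bytes to a decoder, so that a tiny file claiming enormous dimensions fails fast instead of exhausting memory.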

I'd love for your team to find a library that just needs to be wrapped, but I've been analyzing and benchmarking different libraries for years without success. Name a library and I can list the issues.

This smells like throwing interns to the wolves on a problem nobody wants to think hard about (or, more likely, be responsible for).

@kendrahavens

kendrahavens Jun 16, 2015

@nathanaeljones As the PM intern assigned to this project I really appreciate the study material and all of your insight! I was wondering, what do you think of OpenGL and SDL in this scenario? I saw that you researched it, but I was curious why you decided against it?

Also, I've been here a few weeks and there has been barely any intern abuse. ;) Just so you know, our futures don't depend on whether or not a tool ships. We have great support from devs who will be in the trenches with us and really, we are simply prototyping and identifying issues. The more problems we identify the better arguments we will be able to give for what library Microsoft should take the time to develop on.

@lilith

lilith Jun 16, 2015

> @nathanaeljones As the PM intern assigned to this project I really appreciate the study material and all of your insight! I was wondering, what do you think of OpenGL and SDL in this scenario? I saw that you researched it, but I was curious why you decided against it?

SDL doesn't have a software implementation of resampling, and I haven't found an effective way to leverage OpenGL yet. Texture rendering isn't resampling. Mipmapping is fine for doing part of a scaling operation, but the last 300% of scaling must be done with a correct interpolation filter. There aren't any primitives in OpenGL or in DirectX that offer acceptable visual quality.
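
The split described above (cheap halving for most of the reduction, then a proper filter for the last factor of up to 3) can be sketched in one dimension. This is only an illustration under assumptions: the function names are hypothetical, the final filter is a plain triangle kernel chosen for brevity rather than quality, and color space handling is ignored entirely.

```python
import math


def halve(row):
    """Average adjacent pairs: a box filter, like one mipmap level."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row), 2)]


def triangle(x):
    """Triangle (linear) kernel with support [-1, 1]."""
    x = abs(x)
    return 1.0 - x if x < 1.0 else 0.0


def resample(row, target):
    """Resample a row to `target` samples with a normalized triangle filter."""
    n = len(row)
    scale = n / target           # source samples per output sample
    width = max(scale, 1.0)      # widen the kernel when shrinking
    out = []
    for j in range(target):
        center = (j + 0.5) * scale  # output sample center in source coordinates
        lo = max(int(math.floor(center - width)), 0)
        hi = min(int(math.ceil(center + width)), n - 1)
        total = weight_sum = 0.0
        for i in range(lo, hi + 1):
            w = triangle((i + 0.5 - center) / width)
            total += w * row[i]
            weight_sum += w
        out.append(total / weight_sum)  # normalize so flat input stays flat
    return out


def downscale(row, target):
    """Halve while more than 3x too large, then filter the remainder."""
    row = [float(v) for v in row]
    while len(row) > 3 * target and len(row) % 2 == 0:
        row = halve(row)
    return resample(row, target)
```

Two details carry most of the weight here: the kernel is stretched by the scale factor when shrinking, and the weights are renormalized at the edges, so a constant row stays constant after resizing.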

Back in 2013 I created a benchmark to isolate performance issues in DrawImage and compare it to Direct2D. Direct2D's HighQualityCubic implementation was terribly slow (1-2 seconds for a moderately sized image). Same op was < 40ms on the CPU, single threaded.

On the OpenCL front, there's Halide, which is really interesting, but claims OpenCL isn't production ready. The OpenCL 2.1 SPIR-V intermediate language looks great, but we're talking about a provisional spec that's easily 5 years from being commonplace.

I haven't been able to tune Halide to approach handwritten C performance, but @jrk probably could.

There's also the question of GPU virtualization consistency. I'm quite hesitant to take the presence of a fast GPU for granted, particularly when falling back to software rendering would be prohibitively slow.

All of the above adds up to nothing more than an "I'm doubtful". I am in no way an OpenGL or OpenCL expert, and I would suggest tracking one down for better answers.

> Also, I've been here a few weeks and there has been barely any intern abuse. ;) Just so you know, our futures don't depend on whether or not a tool ships. We have great support from devs who will be in the trenches with us and really, we are simply prototyping and identifying issues. The more problems we identify the better arguments we will be able to give for what library Microsoft should take the time to develop on.

It's great to hear my suspicions are unfounded.

@codecore

codecore Jun 17, 2015

Perhaps there are some things you can borrow from the OpenTK project. http://www.opentk.com/

@lilith

lilith Jun 17, 2015

OpenTK is cross-plat already, very nice! As far as I can tell it doesn't introduce any rendering features not present in OpenGL, though, so we're still trying to resolve our need for 2D photograph processing with a texture rendering library.

@richlander

richlander Jun 18, 2015

Member

It looks like OpenTK relies on System.Drawing for loading 2D images: http://www.opentk.com/doc/graphics/textures/loading. Maybe you are saying the same thing above @nathanaeljones.

@mellinoe

mellinoe Jun 18, 2015

Contributor

OpenTK is intended to be a really nice, thin layer around OpenGL (for graphics), I don't think it has anything specifically related to image manipulation. Like Rich said, you're supposed to load textures/images/fonts/etc. with other libraries.

@xoofx

xoofx Jun 19, 2015

Member

That's quite a daunting task...

For an Image Manipulation Library, there is a large difference between:

  1. a library that can just load/save, decompress/convert, flip/resize/crop, transform colors/pixel manips...etc.
  2. a library that can use image as part of a composition/painting/drawing process with svg primitives like lines, circles, rectangles, text rendering as well as image effects, composition by mask, layers...etc. with all the hard problems of anti-aliasing for svg...

Then there is the level of granularity:

  • Do you want to re-implement a fine-grained API like WIC? Have a look at the API. It goes into a lot of detail, and that is a large surface to cover (and WIC, for example, doesn't provide anything for flip, crop or pixel APIs).
  • Or do you want a coarse-grained API like libGD?

As @nathanaeljones suggested, it is quite tempting to integrate some of the features of 2) for images (like image effects, composition by mask but with only alpha images and not svg layers) into 1), but while feasible, it is not ideal in terms of separation of concerns. Imho, sticking to the perimeter of 1) would be better.

If it is mostly for some basic image manipulation on server side, I would expect a high-level API. I don't know much about libGD, but while not perfect, it looks okish for this task.

As mentioned earlier, don't expect anything from OpenGL/OpenCL, unless you want to implement a brand new low-level 2D API that supports HW, but most of the time, you need a software rasterizer anyway in case you don't have access to GPU HW (like on many servers).

@lilith

lilith Jun 19, 2015

I prefer the C API to be as granular as possible without sacrificing performance. I actually like many parts of WIC's API for low-level use, but the implementation leaves some things to be desired, particularly the epic fail that is IWICBitmapScaler.

The chain-based API of WIC is also wasted complexity since the implementation doesn't actually take advantage of it to reduce RAM requirements. If WIC was open-source, one could probably use it with less frustration; the docs are intentionally opaque about operating detail and resource consumption.

I'd also like to re-emphasize that one-size-fits-all is a bad idea here. First, create the low-level API that developers can build on top of. Once there's consensus about the preferred levels of abstraction, then add new APIs that expose them.

Most developers want to do their end-to-end image processing in one line of code; and that is certainly an API they should be given. Don't force them to understand resource lifetimes just to optimize or scale assets.

At the same time, don't make it hard for experts to extend and build on top of. You can't hide pointers and lifetimes from developers without repeating the failures of WPF and System.Drawing. Don't try; make it simple, consistent, and predictable instead of using leaky magical abstractions.

I also agree we don't want to reimplement SVG here. Cairo already exists, why re-create it? Cairo lacks great photo processing and has no image format support to speak of, which is conveniently the part we need most frequently. Sharing a memory layout isn't hard; we can mix libraries at will on the same buffers.
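
As a rough illustration of what "sharing a memory layout" could mean in practice, the sketch below keeps pixels in the layout Cairo documents for `CAIRO_FORMAT_ARGB32`: one native-endian 32-bit value per pixel, alpha in the top byte, color channels premultiplied by alpha. The `Argb32Buffer` class and its methods are hypothetical names, and the stride of `width * 4` is an assumption; a real library may pad rows.

```python
import struct


def premultiply(r, g, b, a):
    """Scale 8-bit color channels by alpha, rounding to nearest."""
    return tuple((c * a + 127) // 255 for c in (r, g, b)) + (a,)


class Argb32Buffer:
    """Pixel storage laid out as premultiplied, native-endian ARGB32."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.stride = width * 4  # bytes per row (assumed unpadded)
        self.data = bytearray(self.stride * height)

    def _offset(self, x, y):
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("pixel out of bounds")
        return y * self.stride + x * 4

    def set_pixel(self, x, y, r, g, b, a):
        """Store a straight-alpha color as a premultiplied ARGB32 value."""
        pr, pg, pb, pa = premultiply(r, g, b, a)
        value = (pa << 24) | (pr << 16) | (pg << 8) | pb
        struct.pack_into("=I", self.data, self._offset(x, y), value)

    def get_pixel(self, x, y):
        """Return the stored premultiplied (r, g, b, a) tuple."""
        (value,) = struct.unpack_from("=I", self.data, self._offset(x, y))
        return ((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF, value >> 24)
```

Because `data` is a flat, writable byte buffer with a known stride, a photo-processing routine and a vector renderer could in principle both operate on it without copying, provided both agree on the premultiplied convention.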

I would draw a small distinction between bitmap composition/effects and vector/layer/tree composition; bitmap alpha blending and basic effects are straightforward, unlike their vector counterparts. Given that we can heavily optimize them for a server context, I'd implement them in the core. Image overlay/watermarking shouldn't force an extra dependency, it's a basic need.

> As mentioned earlier, don't expect anything from OpenGL/OpenCL, unless you want to implement a brand new low-level 2D API that supports HW, but most of a time, you need a software rasterizer anyway in case you don't have access to GPU HW (like on many servers).

@xoofx is the no. 1 contributor to SharpDX, so I'm taking his word for it.

@lilith

lilith Jun 24, 2015

Also, I have the domain libgd.net, and am willing to transfer it to whomever wishes to take responsibility for said project.

@richlander

richlander Jun 25, 2015

Member

Nice offer @nathanaeljones! I'm looking forward to seeing where the project goes w/rt what a web presence would look like.

@migueldeicaza

migueldeicaza Jul 2, 2015

Member

The stated requirements are a good start, but one of the challenges that you will face very quickly is text rendering, which will very rapidly rule out most trivial or simple solutions.

There are a few existing options that could be used.

  1. Cairo [1], which is a 2D graphics library which implements the PDF rendering model, with a convenient API. It was the foundation that we used for both Mono's System.Drawing implementation and for Moonlight, the open source implementation of Silverlight.

In addition to having .NET bindings, Cairo is also the foundation that the C++ ISO committee is considering for a 2D API [2].

It is, in general, a very pleasant library to use.

Cairo has good support for text rendering when combined with Pango [3]: it will support rendering Unicode properly, handle left to right, and right to left (and text with both combined and properly rendered) as well as handling advanced ligature features in fonts and precise layout. In addition, Pango will use the native shaping with Uniscribe on Windows, and CoreText on OSX.

  2. Mono's System.Drawing. This is an implementation of the existing System.Drawing API on top of Cairo or CoreGraphics, and works on Mac (with Cairo or CoreGraphics) and Linux/Unix (with Cairo).

The major downside is that the text layout capabilities of System.Drawing are limited by a design that was barely aware of the complexity of Unicode. The text handling is not suitable for any scripts beyond European languages and the typography support is close to non-existent.

System.Drawing offers a few services on top of Cairo, like reading image metadata and loaders for various file formats.

[1] http://cairographics.org
[2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3888.pdf
[3] http://www.pango.org/

@lilith

lilith Jul 2, 2015

Cairo is the obvious choice for rendering, and I strongly suggest maintaining a compatible memory layout for the bitmap data. It does lack in the image format and image processing department. I don't think Cairo's scope should be extended, necessarily, but instead a complementary library should be created to solve those concerns. I advocate a pay-as-you-go approach here to minimize attack surface area.

I would suggest preserving separable components through all layers, so that no layer prevents the developer from stripping out an unnecessary risk.

  • Core bitmap operations, png/jpeg/gif format codecs. Low risk & small codebase size.
  • Vector rendering (Adds Cairo/Pixman). Medium risk.
  • Text rendering (Loads Pango - not sure if Cairo allows Pango to be loaded dynamically). High risk. Font exploits are monthly news.
  • TIFF support. High risk. TIFF exploits are biannual.
  • WebP support
  • Mozjpeg support (medium risk)
  • Metadata parsing (very high risk if native code, low risk if managed, medium if interop is involved for write support).
  • ... additional codecs.

It's very important that high-risk components be used only when needed, and with disclaimers to monitor for security announcements. Given that nearly all use scenarios can avoid them, I think the design should reflect that.
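
One way to picture the pay-as-you-go idea is a registry that refuses to enable any component above a configured risk ceiling. The sketch below is purely illustrative: `Risk`, `CodecRegistry`, and the decoder callables are hypothetical names, and the risk levels passed in would come from an assessment like the list above.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class CodecRegistry:
    """Holds only the decoders whose risk is within the allowed ceiling."""

    def __init__(self, max_risk=Risk.LOW):
        self.max_risk = max_risk
        self._decoders = {}

    def register(self, fmt, risk, decoder):
        """Enable a decoder if its risk is acceptable; report whether it was."""
        if risk > self.max_risk:
            return False  # component is never reachable from untrusted input
        self._decoders[fmt] = decoder
        return True

    def decode(self, fmt, data):
        """Dispatch to an enabled decoder, or fail loudly for disabled formats."""
        try:
            decoder = self._decoders[fmt]
        except KeyError:
            raise ValueError(f"format {fmt!r} is not enabled") from None
        return decoder(data)
```

With the default ceiling, a high-risk format such as TIFF is simply never registered, so input in that format is rejected rather than parsed; an application that needs it has to raise the ceiling explicitly.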

@lilith

This comment has been minimized.

Show comment
Hide comment
@lilith

lilith Jul 9, 2015

Do we have enough data to establish a development plan at this point? Let's restate any remaining questions, as this thread has gotten somewhat long.

kendrahavens Jul 11, 2015

We aren't quite ready to establish a development plan. While these solutions have taught us a lot about our options in the open source world, they have also sparked some internal conversation about cross-plat graphics. As we iron out short- and long-term plans for a cross-plat graphics story, we are starting to think there might be some allies from other teams with similar perspectives and goals. It has been stated that this is quite a big undertaking, so deciding how best to tackle it and organizing the people ready to hop on it will take time. This discussion has done a fabulous job of putting the project in perspective. The feedback here is very much appreciated.

We are continuing to prototype with libGD to understand its limits for server-side image manipulation. Many other developer needs have been brought up: text rendering, image file formats, low level GPU accelerated rendering, the high level API experience, the important distinction between a granular library like libGD with basic operations versus more advanced image composition operations, etc. These are part of a very large graphics story and weren't entirely expected to come up at this stage and in this issue. Regardless, the discussion is definitely helping map out the future.

I think remaining questions will come up as we delve deeper into libGD and Cairo to understand their potential, but separate issues can be started for each when they are needed. I'll leave this issue open a little longer in case folks have more to add. :-) Thank you all again for your input!

ryanbnl Jul 12, 2015

If this progresses beyond an intern project it seems prudent to hire (or come to some arrangement that allows) @nathanaeljones to lead the project.

Personally I don't think it's a priority. In a web/cloud scenario, something is very wrong if the use case requires the web server to process images in-process. Queue + job workers are easier to secure and scale, and the IPC overhead can be kept in the low milliseconds.
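
For readers unfamiliar with the queue + job workers arrangement mentioned above, here is a minimal Python sketch of the pattern. It uses threads and `queue.Queue` from the standard library as an in-process stand-in; a real deployment along the lines described would put the workers in separate processes or machines behind a message queue. The function names and the `(width, height, scale)` job shape are hypothetical, and `resize_job` only computes target dimensions rather than touching pixels.

```python
import queue
import threading


def resize_job(job):
    """Placeholder for real image work; it only computes the target dimensions."""
    width, height, scale = job
    return (max(1, int(width * scale)), max(1, int(height * scale)))


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives, then exit."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.put((job, resize_job(job)))


def run_jobs(job_list, worker_count=2):
    """Enqueue jobs, let the workers drain the queue, and collect the results."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for job in job_list:
        jobs.put(job)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so each one stops
    for t in threads:
        t.join()
    # All workers have exited, so the results queue is no longer being written to.
    out = {}
    while not results.empty():
        job, size = results.get()
        out[job] = size
    return out
```

In this arrangement the request handler would only enqueue work and return, which keeps the image codecs out of the web server's process.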

I'd prefer MS to put resources into easy-win optimizations for all platforms to improve the @TechEmpower benchmark results.

kendrahavens Jul 20, 2015

Thank you all again for your input. We will let you know when we have ironed out a development plan. We will continue to prototype with libGD and discuss options with the open source libraries mentioned here and ones we are continuing to research.

JanSichula Jul 29, 2015

In my humble opinion this project should receive high attention at Microsoft. A versatile and performant cross-platform image manipulation library would add new value to .NET Core. I would also second ryanbnl's idea that MS would do best to hire @nathanaeljones for this project.

atrauzzi Aug 27, 2015

Issue closed with no reference or resolution?

migueldeicaza Oct 5, 2015

Member

Perhaps it got lost in the noise, but you can use System.Drawing from Mono; it might require a change or two here and there:

https://github.com/mono/mono/tree/master/mcs/class/System.Drawing

It requires this:

https://github.com/mono/libgdiplus

JanSichula Oct 6, 2015

Thanks @migueldeicaza for pointing this out. May I ask what the chances are of getting https://github.com/mono/mono/tree/master/mcs/class/System.Drawing working against CoreCLR?

mellinoe Oct 6, 2015

Contributor

I don't think it's terribly difficult to get at least the basics working. I've done a really crude port of our Windows System.Drawing (not the mono one), and didn't really hit any snags. I just ported a few things that I actually wanted to use, so I ignored all of the random stuff like Printing, etc. that weren't useful for me. By the way, @akoeplinger has a repo here that includes the System.Drawing sources from mono and seems to compile them for .NET Core: https://github.com/akoeplinger/mono-winforms-netcore

akoeplinger Oct 6, 2015

Member

Quoting @mellinoe: By the way, @akoeplinger has a repo here that includes the System.Drawing sources from mono and seems to compile them for .NET Core: https://github.com/akoeplinger/mono-winforms-netcore

Just a heads-up: this is in no way complete and I planned to update it with DNX beta7 but got sidetracked due to other things. I'll probably just update it to beta8 when it comes out next week 😄

lilith Nov 26, 2015

See dotnet/corefxlab#86 (comment)

"We have nobody working actively on this. It's an interesting problem space, but it does not align with our current priorities." - @KrzysztofCwalina

lilith Jun 1, 2016

Today I launched a Kickstarter to fund this effort. You can't shoehorn any existing library to fill this need – I've tried.

If you want to support the project, go here: https://www.kickstarter.com/projects/njones/imageflow-respect-the-pixels-a-secure-alt-to-image

If you can spare a retweet, that would also be fantastic. We need to reach a large audience to make this happen.
