Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularize code further to allow optimized one-projection builds #78

Open
mourner opened this issue Nov 28, 2013 · 38 comments
Open

Modularize code further to allow optimized one-projection builds #78

mourner opened this issue Nov 28, 2013 · 38 comments

Comments

@mourner
Copy link

mourner commented Nov 28, 2013

Browsing through the code, it seems that in this state it can't satisfy an extremely common use case of making a custom build for one particular projection.

All projections share a huge "common" file, many other heavy things like "mgrs" with lots of functions for completely different cases are loaded in one monolith, many constants are defined in batch even if only a handful is used in a particular case, etc.

In a perfect Proj4.js, all the files would be split further and dependencies between different chunks defined at a small scale (which browserify makes very easy technically), which would in turn allow us to make a browserify-based build system that can make a perfect tiny build for each particular projection.

I know this has been briefly touched in #9 and #20, but since that issue was closed, I thought it would be nice to bring this up again as a clear goal.

cc @calvinmetcalf @ahocevar @sheppard

@ahocevar
Copy link
Member

Sounds good @mourner. If you have ideas that you can turn into pull requests, that would be awesome!

@mourner
Copy link
Author

mourner commented Nov 28, 2013

Honestly I'm afraid to get into Proj4js code because I'm a total noob in GIS and weird projections, so will probably break something without knowing. :) But I'll see what I can do to help.

@calvinmetcalf
Copy link
Member

The projections are all loaded in one file (prrojections/index.js) a good
first step might be to remove that file and load projections individual
On Nov 28, 2013 7:09 AM, "Vladimir Agafonkin" notifications@github.com
wrote:

Honestly I'm afraid to get into Proj4js code because I'm a total noob in
GIS and weird projections, so will probably break something without
knowing. :) But I'll see what I can do to help.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-29459450
.

@calvinmetcalf
Copy link
Member

What was done to constants can be done to common, and then the projections
can start importing the individual functions
On Nov 28, 2013 8:28 AM, "Calvin Metcalf" calvin.metcalf@gmail.com wrote:

The projections are all loaded in one file (prrojections/index.js) a good
first step might be to remove that file and load projections individual
On Nov 28, 2013 7:09 AM, "Vladimir Agafonkin" notifications@github.com
wrote:

Honestly I'm afraid to get into Proj4js code because I'm a total noob in
GIS and weird projections, so will probably break something without
knowing. :) But I'll see what I can do to help.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-29459450
.

@probins
Copy link

probins commented Dec 8, 2013

further to this, I'd float the idea of making the defs json objects, and then using one or more of those to determine which modules need to be included.

@calvinmetcalf
Copy link
Member

So that was how it used to work, but this made it overly complicated to use
and set up, I envision the work flow being a version with all projections
which is great for situations where people are bringing in their own data,
and the ability to make custom versions with less projections if you only
have to switch between a couple things
On Dec 8, 2013 8:11 AM, "Peter Robins" notifications@github.com wrote:

further to this, I'd float the idea of making the defs json objects, and
then using one or more of those to determine which modules need to be
included.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30081220
.

@calvinmetcalf
Copy link
Member

But we could make it easy for others to do async builds, like maybe can a
function if it can't find the projection. Before throwing the error
On Dec 8, 2013 9:24 AM, "Calvin Metcalf" calvin.metcalf@gmail.com wrote:

So that was how it used to work, but this made it overly complicated to
use and set up, I envision the work flow being a version with all
projections which is great for situations where people are bringing in
their own data, and the ability to make custom versions with less
projections if you only have to switch between a couple things
On Dec 8, 2013 8:11 AM, "Peter Robins" notifications@github.com wrote:

further to this, I'd float the idea of making the defs json objects, and
then using one or more of those to determine which modules need to be
included.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30081220
.

@probins
Copy link

probins commented Dec 8, 2013

why would that make it complicated to use? ISTM it's much simpler. I just present the build process with a def.json that says for example:

{
  "proj": "utm",
  "zone": "31",
  "ellps": "GRS80",
  "units": "m"
}

and then say 'gimme a build that supports that - and assume I want to convert into/out of WGS84 coords'. I don't need to know what any of those properties means, nor do I need to know anything about the code.

As @mourner says, only having one projection is an extremely common use case. Many of the possible projections are rarely used, and particularly in the browser, downloading a load of code you never use just wastes bandwidth.

@calvinmetcalf
Copy link
Member

Oh I misunderstood you I thought you meant you wanted it to async load the
projections in the browser, building for one projection IM down with
On Dec 8, 2013 10:18 AM, "Peter Robins" notifications@github.com wrote:

why would that make it complicated to use? ISTM it's much simpler. I just
present the build process with a def.json that says for example:

{
"proj": "utm",
"zone": "31",
"ellps": "GRS80",
"units": "m"}

and then say 'gimme a build that supports that - and assume I want to
convert into/out of WGS84 coords'. I don't need to know what any of those
properties means, nor do I need to know anything about the code.

As @mourner https://github.com/mourner says, only having one projection
is an extremely common use case. Many of the possible projections are
rarely used, and particularly in the browser, downloading a load of code
you never use just wastes bandwidth.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30083555
.

@calvinmetcalf
Copy link
Member

Sorry went back and read your original post. Here is the deal.

Conditional loading only is helpful in the browser, not node. To load stuff
conditionally in the browser we'd need to rewrite the entire library to be
asynchronous as that's how loading works in the browser and I would prefer
not to do this.

That being said we can do a few things, we can add a way to catch a missing
projection so that you could load it async yourself . The other thing we
can do is make it easier to build it with only the projections you want,
and to avoid including code that is only required for projections we aren't
using.
On Dec 8, 2013 10:23 AM, "Calvin Metcalf" calvin.metcalf@gmail.com wrote:

Oh I misunderstood you I thought you meant you wanted it to async load the
projections in the browser, building for one projection IM down with
On Dec 8, 2013 10:18 AM, "Peter Robins" notifications@github.com wrote:

why would that make it complicated to use? ISTM it's much simpler. I just
present the build process with a def.json that says for example:

{
"proj": "utm",
"zone": "31",
"ellps": "GRS80",
"units": "m"}

and then say 'gimme a build that supports that - and assume I want to
convert into/out of WGS84 coords'. I don't need to know what any of those
properties means, nor do I need to know anything about the code.

As @mourner https://github.com/mourner says, only having one
projection is an extremely common use case. Many of the possible
projections are rarely used, and particularly in the browser, downloading a
load of code you never use just wastes bandwidth.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30083555
.

@probins
Copy link

probins commented Dec 8, 2013

I am not talking about async loading in the browser; that's an entirely separate issue. I am already (with version 1.1) loading custom proj builds in the browser asynchronously (i.e. as needed), but they are far larger than they need to be. I'm just suggesting that, if there is a build script to create a single-projection build, a proj.def be used as a config to that script.

@calvinmetcalf
Copy link
Member

I'm getting you, I agree building with a proj string or wkt would be a
great feature.

Using the proj4 def json would be even better but is dependent on us
cleaning up and standardizing that, it's currently a mess but yeah that
should be the long term goal.
On Dec 8, 2013 10:50 AM, "Peter Robins" notifications@github.com wrote:

I am not talking about async loading in the browser; that's an entirely
separate issue. I am already (with version 1.1) loading custom proj builds
in the browser asynchronously (i.e. as needed), but they are far larger
than they need to be. I'm just suggesting that, if there is a build script
to create a single-projection build, a proj.def be used as a config to that
script.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30084142
.

@probins
Copy link

probins commented Dec 8, 2013

just noticed #27 (comment)
so @ahocevar is 6 months ahead of me :-)

@sheppard
Copy link
Contributor

sheppard commented Dec 8, 2013

Actually, I think conditional loading is a good way to address the custom build scenario, even without necessarily supporting the asynchronous / browser use case. For example, you might rewrite projections.get() to only require() the projection when requested:

var projs = projections.registry = {}; // i.e. projStore
projections.get = function(name) {
    if (!projs[name]) {
        try {
            // Conditional sync module load - will work in node,
            // but will fail in the browser unless module is already in build
            projs[name] = require('./' + name);
        } catch (e) {
           // Bail out with informative message
           // (Caller could attempt an async load, then request the proj again)
        }
    }
    return projs[name];
};

When generating the custom build, you'd simply run proj4 on the requested projection, then use the projections.registry to generate an includes.js for the actual build (since your build system probably isn't going to do much with the conditional requires).

You could even use this approach to build the main proj4 distribution: as long as you give it a complete set of representative projections, eventually all of the needed modules will be loaded. It's analogous to a coverage test - if any module is never included, that is an indicator that it might not actually be needed at all. This approach would also have the advantage of utilizing proj4's built in projection parsing code, rather than recreating parts of it in a separate build tool.

I was going to conclude with another plug for async modules but I'll leave that for another time :)

@calvinmetcalf
Copy link
Member

We are on same page in general, but we can end up doing it without the try.
Step 1 is to do it with proj names step 2 is based on proj strings and wkt
I think.
On Dec 8, 2013 6:18 PM, "S. Andrew Sheppard" notifications@github.com
wrote:

Actually, I think conditional loading is a good way to address the custom
build scenario, even without necessarily supporting the asynchronous /
browser use case. For example, you might rewrite projections.get() to
only require() the projection when requested:

var projs = projections.registry = {}; // i.e. projStoreprojections.get = function(name) {
if (!projs[name]) {
try {
// Conditional module load - will work in node,
// but will fail in the browser unless module is already in build
projs[name] = require('./' + name);
} catch (e) {
// Bail out with informative message
// (Caller could attempt an async load, then request the proj again)
}
}
return projs[name];};

When generating the custom build, you'd simply run proj4 on the requested
projection, then use the projections.registry to generate an includes.jsfor the actual build (since your build system probably isn't going to do
much with the conditional requires).

You could even use this approach to build the main proj4 distribution: as
long as you give it a complete set of representative projections,
eventually all of the needed modules will be loaded. It's analogous to a
coverage test - if any module is never included, that is an indicator that
it might not actually be needed at all. This approach would also have the
advantage of utilizing proj4's built in projection parsing code, rather
than recreating parts of it in a separate build tool.

I was going to conclude with another plug for async modules but I'll leave
that for another time :)


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30096190
.

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

In theory, once step 1 is solved you could just throw it a wkt and check the projections.registry to see what it came up with.

@calvinmetcalf
Copy link
Member

the relevant code is here but yes it's more or less exactly what your talking about the external call is proj4.Proj.projections.get(projName) and you can add a projection with proj4.Proj.projections.add(require('blah')) I was thinking we can use browserify to sub out this file to get the smaller builds.

@probins
Copy link

probins commented Dec 9, 2013

a further point on this is that these aren't really 'custom' builds: they aren't site-specific. Once a build has been generated for a particular projection, it could be stored somewhere and then used by any site that needs that projection.

@calvinmetcalf
Copy link
Member

@probins true, though when you have 2 and 3 projection builds is where we would run into problems, one thing we could have a core build with no projections and a file for each projection, then all you need to do is include the core and the varius projections you need (this would be in addition to the current all build)

@probins
Copy link

probins commented Dec 9, 2013

yes, but I think @mourner's point is that it's very common to only use 1 projection, and want to transform 4326 coords in and out of that projection. Having >1 projection is a different use case.

@calvinmetcalf
Copy link
Member

nothing I'm planning is going to prevent us from doing one projection
builds, I'm just thinking this all the way through, people are probobly
going to want tmerc + lcc (aka state plane) far more often then they'll
want somerc by itself.

On Mon, Dec 9, 2013 at 9:20 AM, Peter Robins notifications@github.comwrote:

yes, but I think @mourner https://github.com/mourner's point is that
it's very common to only use 1 projection, and want to transform 4326
coords in and out of that projection. Having >1 projection is a different
use case.


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30135101
.

-Calvin W. Metcalf

@calvinmetcalf
Copy link
Member

to summarize the ideas we all have had

  1. create the ability to make builds with custom projections
    1. be able to pass proj strings or wkt to this
  2. publish to bower, cdn, repo? pre built versions with popular builds
  3. factor the projections into plugins
    1. installable via npm (maybe bower)
    2. includable as separate script tags
    3. create hook to allow it to be downloaded async if you want

1.i would be helped by cleaning up wkt and projString
2 could be done down stream or automated like lodash did
3.i could be done by separating out the projections into their own repos
3.ii could be done via the build script
3.iii could be done with something similar to what @sheppard suggested though we could simplify it to remove the try and the require (node doesn't get an advantage by lazy loading like that, it's illegal in es6 modules, though I may be missing an advantage that @sheppard had in mind)

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

We probably don't need the try, though the dynamic require might still be useful. The example I gave didn't really demonstrate my reasoning very well - it comes from an earlier thought I had, which was that we could use the module system itself to determine which modules needed to be included in a build. For example, if someone requests a build for utm, tmerc (and its dependencies) should also be included in the build. Rather than maintaining all of this information separately and redundantly (e.g. dependsOn attributes or a separate build info file), I'm thinking we should find a way to use the requires (and only the requires) to compute the information.

So, the build system I had in mind would actually run the proj4 source files on a projection, and then inspect the module system internals to see which files had actually been included. There at least two key problems with my approach:

  • One, I'm not sure how easy it is to inspect the module system to see what has loaded (RequireJS has defined(), though that won't be too useful here)
  • Two, as @calvinmetcalf noted, dynamic requires like this will probably be illegal in ES6

At any rate, there are probably more elegant ways to do this with ASTs or something. My main point just is that we should be able to use the proj4 source itself when computing a build, rather than duplicating projection information in separate build tool. There are a number of ways this could be done.

@calvinmetcalf
Copy link
Member

my thought was that if some one wanted just utm and tmerc we could either just use the browserify ability to alias files to replace https://github.com/proj4js/proj4js/blob/master/lib/projections/index.js with one that did have these lines https://github.com/proj4js/proj4js/blob/master/lib/projections/index.js#L4-L24 or just include 2 files in the build which had something like

require('proj4').Proj.projections.add(require('./lib/projections/utm'));

and

require('proj4').Proj.projections.add(require('./lib/projections/tmerc'));

in them, this should get browserify to include the relevant files

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

Sure, that would work. I'm not so much talking about how to actually include the files (either of your examples seem fine), but more about how we would compute which files we actually needed. We need some way to take a projection definition (WKT etc.) and know just by looking at it what modules to include. I'm just proposing we use the actual module requires() in some way, rather than duplicating the dependency information somewhere else.

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

Re: 3.iii (async loading), the most important takeaway from my example is that a hook won't really be able to do much except report which modules are missing - it won't be possible to switch to async mode while in the middle of loading a projection. proj4 would need to bail out and leave it to the user to request the modules async, and then call the sync API again to try loading the projection. They would need some indicator of which modules were actually missing so they knew what to load. The computation of which modules are missing might be very similar to the computation for a build, which is another motivation to include that computation code in proj4 itself rather than in a separate build script. (The other option is to tell the async loading user that they need to know which additional modules they need, and that they should load these modules before calling proj4.)

Re: 3.ii (separate script tags) - I must admit this one makes me a bit sad, as an async (i.e. AMD) module loader would handle this use case just fine and use only one script tag (other than the ones it creates automatically to load deps). Also, I assume it is not possible to require() individual modules from other browserify bundles once they have a UMD wrapper. So, to support this workflow with browserify we would need to either:

  • create a non-UMD browser build (breaking AMD use cases), or
  • rewrite each projection module to rely only on proj4's public API (or is this what you had in mind anyway?)

At any rate, it will be hard to effectively support the async use case without adding an async API to the proj4 source in some way. It doesn't need to be a high priority now, but I don't think it can be put off indefinitely. At some point, I would like to explore the option of bringing back the AMD modules (probably just as define-wrapped source files, and leave it to AMD users to create a custom build). I think we could do this without much extra work by leveraging r.js' conversion tool in a supplemental build process.

I understand that node integration is the priority now, but I would argue that the browser use case is at least equally valid and (as you know) am concerned about whether browserify can accomplish everything we need.

@calvinmetcalf
Copy link
Member

so from the first one comment, I was envisioning calling lib/projString.js or lib/wkt.js to figure out which projection is necessary.

so I was thinking that a hook could go here which could be handled by something async, even if it is something crude at first like loads the module and then tries again

Re Re: 3.ii I should have also included, load via amd in that list,:) there is no reason you wouldn't be able to load proj4, and any modules you want through an async module loader there,

we can write a script in grunt that allows us to do grunt build --include=tmerc,utm, you should also be able to do something like grunt build --nodeps and then latter do require([proj4,tmerc,utm],function(proj4,tmerc,utm){proj4.Proj.projections.add(tmerc);proj4.Proj.projections.add(utm);...

currently the projections are fairly self contained and only require functions out of the common module which I now split up making them easy to create individually.

lastly we are not abandoning amd users, all of the files we are producing are wrapped in defines, thats the point of umd

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

Sure, I didn't mean abandonment, just that certain workflows (in particular the cross-module, async ones) would be much easier with AMD source modules. If the projections are mostly self-contained anyway, then the UMD wrappers should work fine. Is the idea that you would include bits of the (former) common module in each standalone projection module? Anything not exposed via the public API will need to be included in each built projection module, which could lead to some redundancy as there would be a bunch of copies of e.g. adjust_lon. If we could use cross-file requires of individual source modules (either via AMD or non-UMD browserify) these common elements could be included only once. It's probably not a big deal (and doesn't matter at all for the single-projection use case), but it's something to consider.

Ok, I think I get what you're thinking with the initTransform hook. Presumably the user could just publish a folder with a build for each projName they might need (and each file would have all of the projName's deps inlined). Again, there is a potential for redundant common code, but maybe that's ok.

@calvinmetcalf
Copy link
Member

I was thinking that for deployment purposes (where you'd care about the extra bits) you'd build it as one file and there would be minimal duplication, in any case as long as all the files where part of the same file, i.e. concated, gzip should handle it.

@calvinmetcalf
Copy link
Member

if we clean up the proj4 constructor (which is on my list) it could be much easier to make it somewhat async

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

You could imagine SaaS-type applications where you'd intentionally deploy a limited proj4 and load the projection modules on a per-user basis. So, the redundancy will be unavoidable in that case - but whether that matters depends on the amount of common overlap. Would it be useful to put some file size estimates together? (If it were me, I'd probably just deploy the complete proj4 anyway...)

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

Yeah even "just" a callback version of the constructor would cover a lot of async use cases. (I say "just" since I'm not sure how much internal refactoring that would entail).

@calvinmetcalf
Copy link
Member

it would probably be the kind of refactoring we'd want to do anyway

On Mon, Dec 9, 2013 at 1:56 PM, S. Andrew Sheppard <notifications@github.com

wrote:

Yeah even "just" a callback version of the constructor would cover a lot
of use cases. (I say "just" since I'm not sure how much internal
refactoring that would entail).


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30160869
.

-Calvin W. Metcalf

@sheppard
Copy link
Contributor

sheppard commented Dec 9, 2013

Back to the single-projection build - might that just be concatenating proj4 core and one standalone projection module?

@calvinmetcalf
Copy link
Member

lemme open a pull to show you

On Mon, Dec 9, 2013 at 2:16 PM, S. Andrew Sheppard <notifications@github.com

wrote:

Back to the single-projection build - would that just be concatenating
proj4 and one standalone projection module?


Reply to this email directly or view it on GitHubhttps://github.com//issues/78#issuecomment-30163718
.

-Calvin W. Metcalf

@calvinmetcalf
Copy link
Member

@sheppard check out the builds branch. grunt works like normal, but grunt build only includes latlon and merc (the two that we have hard coded espg codes for). grunt build:utm,lcc also includes utm and lcc etc. Currently only works with internal projection codes (aka filename minus the .js) but that should be easy once I figure out the best way to pass it in (we could do files, just strings on the command line, or something else...).

But the good news is that basic one that can do just merc, 7101 byes gzipped and minified. yes you read that right less then 7 KB and that's including cruft like datum transformation, so you could use that to do nad83 lat longs to web mercator.

@mourner
Copy link
Author

mourner commented Dec 10, 2013

@calvinmetcalf that is awesome! Great work!

@sheppard
Copy link
Contributor

@calvinmetcalf nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants