Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions about proj.h #529

Closed
rouault opened this issue Jun 22, 2017 · 20 comments
Closed

Questions about proj.h #529

rouault opened this issue Jun 22, 2017 · 20 comments

Comments

@rouault
Copy link
Member

rouault commented Jun 22, 2017

We should probably remove the following :

extern int pj_errno; /* global error return code */   ==> we don't want to encourage relying on this legacy mechanism

Questions :

  • how do you know if pj_trans() result an invalid result ? Call pj_error() ?
  • What is the purpose of #define PJ_ERR_TELL -56789 ?
  • What is the range of values accepted by the level arg of pj_err_level() ?
  • Difference between pj_err_level() and pj_log_level() ?
  • Do we need pj_log_error(), pj_log_debug(), pj_log_trace() in the public API ? Look like they are just for internal needs
  • Do we need pj_context_XXXX() functions in the public API ?
  • struct projCtx_t; typedef struct projCtx_t PJ_CONTEXT; typedef int *PJ_FILE; are not used anywhere else in proj.h. Why do we need them there ?
  • GDAL and QGIS use pj_get_def() as a way of getting a "normalized" version of a proj string (+init= expansion, things related to datums too). Is it something that should be found in the new API too ?
  • Do we need pj_shutdown in publi API ?
  • Do we need pj_coo_null and pj_obs_null in public API ?
  • Why do we have all those standard includes at the top of the file ? This should be avoided I think. None of them seem necessary for the use of the API

Suggestions:

  • TODEG and TORAD should probably be prefixed by PJ_ to avoid name clashes
  • One of the nice property of pj_transform() is the ability to pass several coordinates at once. We don't use vector optimizations, but the pj_transform() API would be potentially compatible of them. It would probably be a good idea to have the equivalent with the new API. But so that vector instructions work properly the X,Y,Z,T values should go into separate arrays + a stride parameter like current pj_transform(). For now this would just be a loop over the pj_trans() API
@kbevers
Copy link
Member

kbevers commented Jun 22, 2017

I know that @busstoptaktik is working on some improvements for proj.h at the moment. At least your last suggestion is being taken care of as part of that. Using PJ_TORAD in favour of TORAD is fine by me.

how do you know if pj_trans() result an invalid result ? Call pj_error() ?

It returns pj_obs_error() which is a PJ_OBS struct with HUGE_VAL's as coordinate components. The error level is also set to EINVAL, which can be read by calling pj_err_level(0, PJ_ERR_TELL).

What is the purpose of #define PJ_ERR_TELL -56789 ?

See above.

Difference between pj_err_level() and pj_log_level()

One sets the error level, the other controls how much noise the program makes. Higher log level, higher output to stdout (or where ever).

Do we need pj_log_error(), pj_log_debug(), pj_log_trace() in the public API ? Look like they are just for internal needs

I agree that they should be private. We need a place to put them though. proj_internal.h?

Do we need pj_context_XXXX() functions in the public API ?

They mainly exists to keep some sort of backwards compatibility. If we can agree on (some form of) my proposal on the mail list to day, we are free to change that behaviour all together, which I think will be a good thing.

struct projCtx_t; typedef struct projCtx_t PJ_CONTEXT; typedef int *PJ_FILE; are not used anywhere else in proj.h. Why do we need them there ?

PJ_CONTEXT is used in pj_obs_api.c. Thomas elaborates a bit about it in the header of proj.h.

@kbevers
Copy link
Member

kbevers commented Jun 22, 2017

Also, I agree that something should be done about pj_errno.

@rouault
Copy link
Member Author

rouault commented Jun 22, 2017

It returns pj_obs_error() which is a PJ_OBS struct with HUGE_VAL's as coordinate components. The error level is also set to EINVAL, which can be read by calling pj_err_level(0, PJ_ERR_TELL).
One sets the error level

Hum this will need to be documented somewhere. I still don't get how pj_err_level() works and what "error level" means

I agree that they should be private. We need a place to put them though. proj_internal.h?

Yes

PJ_CONTEXT is used in pj_obs_api.c

Implementation detail then. Candidate for proj_internal.h

@kbevers
Copy link
Member

kbevers commented Jun 22, 2017

Hum this will need to be documented somewhere.

Yeah, there's already a ticket for that: #442. As far as I know Thomas is working on it. Some older un-merged work is here: https://github.com/OSGeo/proj.4/blob/687e591b6d4cdb3370f2bb50fedbda57da83e76c/docs/source/development/obsapi.rst

@rouault
Copy link
Member Author

rouault commented Jun 22, 2017

Other questions :

  • should pj_free() be used with the same name for proj.h and proj_api.h ? If we keep proj_api.h as a legacy way but don't discontinue it, at some point we might want the objects returned by the old pj_init() family and the new pj_create() family to be different, and thus freable with different methods. Well, as the structures are opaque, we could probably use a trick to know which object it is, but that would be dirty.
  • Shouldn't all the new api be prefixed with something like pj_pipeline_ or pj_transform_ (or pj2017_ ;-) ) to avoid confusion with the older APIs ?

Suggestion:

  • do not add pj_fileapi_set() in proj.h. The need is not obvious at all for now

@rouault
Copy link
Member Author

rouault commented Jun 22, 2017

Do we need pj_context_XXXX() functions in the public API ?

They mainly exists to keep some sort of backwards compatibility. If we can agree on (some form of) my proposal on the mail list to day, we are free to change that behaviour all together, which I think will be a good thing.

The presence of those context explicit functions in proj.h is very surprising, given that the concept itself is absent from other proj.h API. I'm not sure of which use cases they are supposed to cover. I believe they should be removed.

@kbevers
Copy link
Member

kbevers commented Jun 23, 2017

should pj_free() be used with the same name for proj.h and proj_api.h ? If we keep proj_api.h as a legacy way but don't discontinue it, at some point we might want the objects returned by the old pj_init() family and the new pj_create() family to be different, and thus freable with different methods. Well, as the structures are opaque, we could probably use a trick to know which object it is, but that would be dirty.

Good point. So far the ambition has been to use the same underlying objects, but yeah, that might not be possible in the future. Could be fixed with a new prefix as you suggest.

Shouldn't all the new api be prefixed with something like pj_pipeline_ or pj_transform_ (or pj2017_ ;-) ) to avoid confusion with the older APIs ?

I am -1 on that. At leas those prefixes at least. One of the main ideas with the new API was to create a simple and clean API. pj_* looks clean, and so far we have made it a point to reuse as few names
as possible (which id why we have pj_trans, pj_transform was occupied). An alternative to that could be using proj_* instead. That could work, and opens up using names already used in proj_api.h

The presence of those context explicit functions in proj.h is very surprising, given that the concept itself is absent from other proj.h API.

It is not though. For every PJ object you create, there's an attaced ctx context-object. In many functions in proj_api.h a ProjCtx is used rather than a projPJ, leading to lots of functions calls like pj_ctx_set_errno(pj_get_ctx(P)) whereas that has been simplified in proj.h to always take a PJ object and then infer the context from that, i.e. pj_err_level(P, 3). Makes sense?

I'm not sure of which use cases they are supposed to cover. I believe they should be removed.

Same cases as the existing contexts: Multithreaded applications.

@rouault
Copy link
Member Author

rouault commented Jun 23, 2017

An alternative to that could be using proj_* instead

Yeah, better

For every PJ object you create, there's an attaced ctx context-object

For the user of proj API, couldn't this be a hidden implementation detail ? If you consider that the addition of the context concept made the proj_api a bit messy, we should try very hard not to expose it in the new cleaner API.

Same cases as the existing contexts: Multithreaded applications.

Show me real-world scenarios that need those functions. Why would I want to "renew" a context, or transfer it to some other objects. For me, the requirements for a multithreaded application are :

  • as a user of proj.h, I want to be able to instanciate, use and destroy PJ* objects in multiple threads
  • ... but ..., as a user of proj.h, I promise I will not use the same PJ* object from multiple threads at the same time. I may use the same PJ* object from dfiferent threads in time, but it is my business to coordinate my threads so they don't use it at the same time.
    So technically this means for the proj implementation: instanciate a new ctx for every PJ* object you create.
    Unless I'm missing something...

Those requirements are what most multithreaded-compatible libraries do

@kbevers
Copy link
Member

kbevers commented Jun 23, 2017

For the user of proj API, couldn't this be a hidden implementation detail ? If you consider that the addition of the context concept made the proj_api a bit messy, we should try very hard not to expose it in the new cleaner API.

Yes, it could. It was included in to have a 1-to-1 relationship between proj_api.h and proj.h, in order to ease migration from one to the other. We don't have to go down that road if we plan on keeping proj_api.h as a legacy API.

Show me real-world scenarios that need those functions. Why would I want to "renew" a context, or transfer it to some other objects. For me, the requirements for a multithreaded application are :

I have discussed this at length with Thomas, and we are both also confused about why it was made the way it was originally. The only thing that has come up so far, I think, is that it might be useful for several threads to log to the same file, via a shared context. But even that I think is a a highly doubtful use case. Personally I am happy to remove that functionality.

@rouault
Copy link
Member Author

rouault commented Jun 23, 2017

I have discussed this at length with Thomas, and we are both also confused about why it was made the way it was originally.

See http://osgeo-org.1560.x6.nabble.com/threadsafety-and-PROJ-4-the-projCtx-API-Update-td3841784.html . So bascially as I said just to avoid the issue of the global pj_errno

The only thing that has come up so far, I think, is that it might be useful for several threads to log to the same file, via a shared context. But even that I think is a a highly doubtful use case.

Looking at the above message, I'm pretty sure this wasn't at all in mind of Frank

@kbevers
Copy link
Member

kbevers commented Jun 23, 2017

Looking at the above message, I'm pretty sure this wasn't at all in mind of Frank

I think that is correct. The concern has more been: How has it been used in the wild? So far I haven't seen any surprising uses. The same goes for the FileAPI functions. So I think it will be better to hide contexts and FileAPI in the new API.

@busstoptaktik
Copy link
Member

busstoptaktik commented Jun 23, 2017 via email

@rouault
Copy link
Member Author

rouault commented Jun 23, 2017

As a summary of all my above remarks, here's a proposal of what I'd consider as a nice proj.h : https://gist.github.com/rouault/890e1c4cf0da5a282c292625fdf20917#file-my-vision-ofa-clean-proj-h

Note: I've removed absolutely all includes to other files.

@kbevers
Copy link
Member

kbevers commented Jun 23, 2017

Can I suggest that you make a new gist, that starts with the current proj.h and the edit it with your changes? That way the changes show up in the diff. Makes it a little easier to track the changes.

Should we change the prefix to proj_?

@rouault
Copy link
Member Author

rouault commented Jun 23, 2017

Can I suggest that you make a new gist, that starts with the current proj.h and the edit it with your changes?

Here it is https://gist.github.com/rouault/fe6fa8b610c0c5e15cb3f0c592f36bd5/revisions

Should we change the prefix to proj_?

I've no definite opinion on this. I've tried a few proj_ likely symbol names with codesearch.debian.net and found that proj_free ( https://codesearch.debian.net/search?q=proj_free ) is actually used by the VTK project. The irony is that this function comes from the original libproj, embedded in VTK sources. But apparently Debian uses a patch to switch to the regular external proj.4: http://sources.debian.net/src/vtk6/6.3.0%2Bdfsg1-5/debian/patches/102_enable_system_proj4_lib.patch/#L115
But the bottom line is that proj_ is associated with the original libproj

@kbevers
Copy link
Member

kbevers commented Jun 23, 2017

Thanks!

But the bottom line is that proj_ is associated with the original libproj

prj_ ? or perhaps proj4_, pj4_ or prj4_? Although the 4 seems a bit out of place...
If there is only one occurrence of proj_ being used I don't think it is a big problem.

@rouault
Copy link
Member Author

rouault commented Jun 23, 2017

Just added PROJ_VERSION_MAJOR, _MINOR and _MICRO to my proposal.

prj_ seems a reasonable choice.

@busstoptaktik
Copy link
Member

I would prefer proj_ and then using an alternative name for the destructor function - e.g proj_destroy, or proj_delete, being antonyms to proj_create. Whatever name, it would probably just be a thin wrapper for pj_free...

@busstoptaktik
Copy link
Member

See http://osgeo-org.1560.x6.nabble.com/threadsafety-and-PROJ-4-the-projCtx-API-Update-td3841784.html .
So bascially as I said just to avoid the issue of the global pj_errno

@rouault - Strange - Frank's post is from june 2010, and elaborates on the problems of implementing thread local storage in a platform independent manner. Just a year later, we introduced platform independent thread local storage for the trlib transformation library (source of proj's transverse mercator and 2D Horner code) using just a single header file to provide some syntactic sugar, and relief for Microsoft-inflicted pains: See comments over at https://bitbucket.org/KMS/trlib/src/eafee76d1127fa1db89ac9c15a9edafabcb599be/TR_INC/trthread.h

The idea was to build a single thread library by default, and go thread safe only if the symbol "THREAD" was defined.

I realize, though, that PROJ.4 do production runs on many more platforms before lunch, than trlib was ever just marginally tested on (although trlib was intended to run on any platform - including Windows - providing just a bare minimum of posixy blessings).

Going forward with a new API I think we should use a combination of an PJ internal error flag, and the system errno. The latter is thread local in threaded systems: It is really a macro expanding to a dereferencing of the return value of a function returning a pointer to the thread local errno (this makes &errno a syntax error on threaded systems).

When introducing an errno member of each PJ object, the only remaining need for a global pj_errno is really just to signal that a we could not allocate or initialize the PJ. Cases that may be adequately handled by returning 0 and setting errno to ENOMEM or EINVAL, respectively.

Of course setting the PJ->errno should be done using proj-internal macros passing stuff through to the global and context local pj_errnos.

Checking validity of a PJ could, at the API level be done by a PJ_OK(P) macro or somesuch (incidentally, "pjok" is the Danish word for "wimp").

We should strive to make the OBS API pj_errno agnostic, widely ignoring the global and context wide pj_errno

Since pj_obs_api.[ch] and proj.h have largely been my own private playground, I will do some radical reorg work, introducing a new API namespace (proj_) and largely removing support for contexts and fileapis

@rouault
Copy link
Member Author

rouault commented Jun 24, 2017

Thread local storage requires explicit support from compilers, and at the time this was introduced in proj, I believe this was only for cutting edge compilers, hence the choice for using classic pthread / win32 thread.

I wouldn't be enthousiastic about using the standard errno mechanism for proj. errno can be reset by various calls to the standard library, etc in a way that is hard to predict. By using a context-specific pj_errno we master precisely when we set it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants