
thread-safety #13

Open
gauteh opened this issue Jan 13, 2020 · 17 comments

Comments

@gauteh

gauteh commented Jan 13, 2020

Does this project support thread-safe reading of HDF5 files?

@jgallagher59701
Member

jgallagher59701 commented Jan 13, 2020 via email

@gauteh
Author

gauteh commented Jan 13, 2020

Thanks. I've been experimenting with a lightweight, async Rust implementation of a DAP2 server. The ambition is to only support serving simple DAP, no catalog etc. Performance is similar to Hyrax for sequential reads (with no caching etc.), but it streams responses, so it does not require much memory (except for caching metadata currently). Concurrent data reads (tested with e.g. autocannon or wrk) suffer from the global locks necessary in the netCDF and HDF5 libs, while metadata is quite fast (70k requests/sec for DAS).

If hyrax has a thread-safe interface to HDF5 (at least for reads) that would greatly improve concurrent performance.

https://github.com/gauteh/dars
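To illustrate the bottleneck described above (a minimal sketch, not dars's actual code): with a non-thread-safe HDF5 library, every read from every request handler has to funnel through one global lock, so concurrent requests queue up. The `Hdf5File` struct here is a hypothetical stand-in for the C library handle:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for a non-thread-safe HDF5 handle: all calls
// into the C library must be serialized behind one global lock.
struct Hdf5File {
    reads: usize,
}

impl Hdf5File {
    fn read_slice(&mut self) -> Vec<f64> {
        self.reads += 1;
        vec![0.0; 16] // placeholder for an actual dataset read
    }
}

fn main() {
    // One lock shared by every request handler: this is where
    // concurrent data reads serialize.
    let file = Arc::new(Mutex::new(Hdf5File { reads: 0 }));

    let handles: Vec<_> = (0..8)
        .map(|_| {
            let file = Arc::clone(&file);
            thread::spawn(move || file.lock().unwrap().read_slice().len())
        })
        .collect();

    let total: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 8 * 16);
    assert_eq!(file.lock().unwrap().reads, 8);
    println!("8 threads served {} values, one read at a time", total);
}
```

Metadata responses can be served from a cache without touching this lock, which is why they scale so much better than data reads.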

@jgallagher59701
Member

jgallagher59701 commented Jan 13, 2020 via email

@gauteh
Author

gauteh commented Jan 13, 2020

I think that it should be safe in that context. How would this relate to using the code as an AWS Lambda function?

I haven't looked much into that. But Rust works on AWS Lambda. And from briefly looking at this guide, they also use the tokio runtime, which is what I am using for main. So my handler routine could potentially be plugged in there (with some adaptation). I don't know how AWS Lambda functions access files. There is some global state stored in memory; I suspect this would be better kept in a separate service (Redis or something), letting the lambdas fetch from there. As far as I understand, AWS Lambda or e.g. Cloudflare are more geared towards a microservice setup?

This is currently about 1500 lines of code, so anything is possible.

If there is any way this could be used / incorporated in the OPeNDAP ecosystem, that would be very interesting.

@jgallagher59701
Member

jgallagher59701 commented Jan 13, 2020 via email

@gauteh
Author

gauteh commented Jan 14, 2020

> On Jan 13, 2020, at 12:39, Gaute Hope @.***> wrote: I think that it should be safe in that context. How would this relate to using the code as an AWS Lambda function? I haven't looked much into that. But Rust works on AWS Lambda. And from briefly looking at this guide https://aws.amazon.com/blogs/opensource/rust-runtime-for-aws-lambda/ they also use the tokio runtime, which is what I am using for main. So my handler routine could potentially be plugged in there (with some adaptation). I don't know how AWS Lambda functions access files. There is some global state stored in memory; I suspect this would be better kept in a separate service (Redis or something), letting the lambdas fetch from there. As far as I understand, AWS Lambda or e.g. Cloudflare are more geared towards a microservice setup?

Well, I’ve been thinking of accessing data stored in S3. We have code to do that.

> This is currently about 1500 lines of code, so anything is possible. If there is any way this could be used / incorporated in the OPeNDAP ecosystem, that would be very interesting.

I wonder what would be the best way? I am looking for a student intern and it could be a great project, with the caveat that I know zero Rust...

Absolutely! I did not start with Rust too long ago, but I think it is really well suited for this type of service. Especially for the safety in concurrency and memory, which tends to be very difficult to get right in C/C++ when writing this type of code, while still offering performance similar to C++. Go is probably in the same niche, but less safe with respect to memory and race conditions.

I think that to support both an AWS Lambda setup and more traditional load-balanced servers, things need to be split up a bit more. But since AWS Lambda also uses tokio for async functions, this should be possible to do in an efficient and clean way.

@magnusuMET

Do you require a thread-safe hdf5 installation? If not, how do you synchronize accesses when calling into hdf5?

@jgallagher59701
Member

jgallagher59701 commented Jan 15, 2020 via email

@gauteh
Author

gauteh commented Jan 17, 2020

> Do you require a thread-safe hdf5 installation? If not, how do you synchronize accesses when calling into hdf5?

> The OPeNDAP server only reads HDF5, so there’s no need to synchronize the accesses.

If I understand correctly, you have your own implementation of an HDF5 reader? One which is reasonably thread-safe for reads (e.g. no global buffers)? This is very useful, since the official HDF5 library is not thread-safe even for reads; it is relatively easy to crash it or corrupt data by stress-testing it.

The official HDF5 library can be compiled with a global lock, making it thread-safe only by forcing all access to be sequential. This does not really help performance: it is the same synchronization you would otherwise have to implement (relatively easily) yourself as a user of the non-thread-safe HDF5 (or netCDF) library. It does not allow concurrent access; it just moves the synchronization into the library.
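To illustrate the difference (a sketch with a toy struct, not dars or the actual reader): if a reader really is safe for concurrent reads, a `std::sync::RwLock` admits many readers at once, whereas with the official library even reads must hold an exclusive lock, so nothing runs in parallel:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical pure reader with no shared mutable state: concurrent
// reads are safe, so a read-write lock never blocks readers on readers.
struct PureReader {
    data: Vec<f64>,
}

impl PureReader {
    fn value_at(&self, i: usize) -> f64 {
        self.data[i]
    }
}

fn main() {
    let reader = Arc::new(RwLock::new(PureReader {
        data: vec![1.0, 2.0, 3.0, 4.0],
    }));

    // All four threads can hold the read lock simultaneously.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let reader = Arc::clone(&reader);
            thread::spawn(move || reader.read().unwrap().value_at(i))
        })
        .collect();

    let sum: f64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(sum, 10.0);
}
```

With the globally locked C library, the `RwLock` above would have to be a `Mutex`, which is exactly the library-level synchronization described in the comment.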

@jgallagher59701
Member

jgallagher59701 commented Jan 17, 2020 via email

@gauteh
Author

gauteh commented Jan 23, 2020

That would be very useful, if you find a student interested in dars. Is the primary use case an AWS-deployed service using S3? In our case this would be very useful as well, but a more traditional setup (file system) should also be supported.

Currently, metadata is cached, but XDR/DODS is not. That might not be necessary with a multi-threaded HDF5 reader, but otherwise some caching mechanism has to be supported there as well (I could not really find a good proxy to put in between; it might be possible, but it would not be able to make use of overlapping data requests). With a multi-threaded HDF5 reader, the main performance gains from caching would come from memory caching (Redis / memcached), from not having to convert to XDR, and possibly from avoiding file-system issues if the data store is slow. In any case, to be able to support these three use cases:

  • s3
  • traditional without caching (e.g. a small server only requiring a single instance of the main server)
  • traditional with caching

things need to be split up into a library in a sensible way, so that two or three targets can be built from it. I think that trying to fit all three into one target will result in branches so independent that they are essentially different programs. Do you have any thoughts on this?
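One way to get those targets from a single code base is Cargo feature flags; a hypothetical sketch of the manifest (feature and dependency names are assumptions, not dars's actual Cargo.toml):

```toml
# Hypothetical Cargo.toml for a dars library crate: one core DAP2
# implementation, with backends selected per deployment target.
[features]
default = ["fs"]
fs = []                    # traditional file-system backend, no cache
s3 = ["dep:rusoto_s3"]     # S3-backed reads (dependency name assumed)
cache = ["dep:redis"]      # optional memory-cache layer for responses
```

Each binary target (plain server, cached server, Lambda handler) would then enable only the features it needs, sharing the core DAP logic.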

@gauteh
Author

gauteh commented Jan 25, 2020

I've looked at the DMR++ module, trying to understand how it is built up, and I am now appropriately confused. One thing I was wondering about: it seems that everything goes through libcurl; from looking at dmrpp_module/data/README.md it seems that local files are also accessed through libcurl using file:// URLs?

The same README.md also mentions something about 2 or 3 times smaller files; does that mean that the DMR++ files also contain data? It still seems like a large size for metadata and chunk maps (is the map file the DMR++ file?).

It would maybe make sense for me to support DMR++ files.

@ndp-opendap
Contributor

ndp-opendap commented Jan 25, 2020

> I've looked at the DMR++ module, trying to understand how it is built up, and I am now appropriately confused. One thing I was wondering about: it seems that everything goes through libcurl; from looking at dmrpp_module/data/README.md it seems that local files are also accessed through libcurl using file:// URLs?

  • Yes, it uses libcurl for access.
  • Yes, file:// URLs work for range access of local files. This might be replaced with specific file-pointer-based code in an effort to improve performance, but since curl just "does it" we rolled with it in the short term.
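The "file-pointer-based code" alternative is essentially a seek-and-read of each chunk's byte range. A minimal sketch in Rust (function name and file path are hypothetical, not from the dmrpp module):

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom, Write};

// Read `n` bytes starting at `offset`, the same byte-range access a
// curl range request on a file:// URL performs.
fn read_range(path: &str, offset: u64, n: usize) -> std::io::Result<Vec<u8>> {
    let mut f = File::open(path)?;
    f.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; n];
    f.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    // Demo file standing in for an hdf5 chunk store.
    let path = std::env::temp_dir().join("dmrpp_demo.bin");
    File::create(&path)?.write_all(b"0123456789")?;

    // Fetch 4 bytes starting at offset 3, like one chunk-map entry.
    let chunk = read_range(path.to_str().unwrap(), 3, 4)?;
    assert_eq!(chunk, b"3456");
    Ok(())
}
```

For remote (http/s3) chunks the same offset/length pair becomes an HTTP Range header, which is why a single curl code path can cover both cases.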

> The same README.md also mentions something about 2 or 3 times smaller files; does that mean that the DMR++ files also contain data? It still seems like a large size for metadata and chunk maps (is the map file the DMR++ file?).

The dmr++ files contain all of the source file's syntactic and semantic metadata in addition to the chunk-maps. In many cases the semantic metadata of NASA data products is quite large. No actual data values are stored in the dmr++ files.

Nathan
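For a rough picture of what such a file holds, a simplified, hypothetical dmr++ fragment for one variable might pair the DAP metadata with a chunk map along these lines (element and attribute names here are illustrative, based on the description above, not copied from the actual schema):

```xml
<Float32 name="sst">
  <Dim name="/lat"/>
  <Dim name="/lon"/>
  <dmrpp:chunks compressionType="deflate">
    <!-- byte offset and length of each chunk in the source hdf5 file -->
    <dmrpp:chunk offset="40960" nBytes="16384" chunkPositionInArray="[0,0]"/>
    <dmrpp:chunk offset="57344" nBytes="16384" chunkPositionInArray="[0,512]"/>
  </dmrpp:chunks>
</Float32>
```

The data values themselves stay in the source file; the dmr++ only records where to fetch them from.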

@gauteh
Author

gauteh commented Jan 25, 2020

Thanks, that makes sense.

@ndp-opendap
Contributor

I should have pointed out that this arrangement allows the server to construct all of the DAP2/4 metadata responses (.ddx, .dds, .dmr, etc) without interrogating the source hdf5/nc4 file.

@gauteh
Author

gauteh commented Jan 27, 2020 via email

@jgallagher59701
Member

jgallagher59701 commented Jan 28, 2020 via email
