Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TileDB plugin #2363

Merged
merged 3 commits into from Feb 13, 2019
Merged

TileDB plugin #2363

merged 3 commits into from Feb 13, 2019

Conversation

@normanb
Copy link
Contributor

normanb commented Feb 8, 2019

This plugin adds TileDB support for point clouds. For ease of testing there is a Dockerfile in a different branch.

A sample pipeline for writing


{
  "pipeline":[
    {
      "type":"readers.las",
      "filename":"ot_C320000_4172000.laz"
    },
    {
      "type":"writers.tiledb",
      "config_file":"tiledb.config",
      "compression":"zstd",
      "compression_level": 75,
      "array_name":"sample_array"
    }
  ]
}

and a config file for TileDB

sm.dedup_coords true

To write to a S3 bucket change the TileDB config file to be;

sm.dedup_coords true
vfs.s3.scheme https
vfs.s3.region us-east-1
vfs.s3.endpoint_override
vfs.s3.use_virtual_addressing true

and the array_name in the pipeline to be s3://mybucket/sample_array

More detail on S3 and TileDB here

@hobu hobu added this to the 1.9 milestone Feb 8, 2019
SET(TILEDB_VERSION "${TILEDB_VERSION_MAJOR}.${TILEDB_VERSION_MINOR}.${TILEDB_VERSION_PATCH}"
CACHE INTERNAL "The version string for TileDB library")

IF (TILEDB_VERSION VERSION_LESS TILEDB_FIND_VERSION)

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

We've seen some issues with finding version X of header files and version Y of libraries. I don't know if there's a canonical way to deal with this kind of thing in these cmake find scripts, but if there's a way to check the library version as well, it could save some hassle.

This comment has been minimized.

Copy link
@normanb

normanb Feb 11, 2019

Author Contributor

This isn't possible currently, but will be. Depending on when this PR gets merged I can add that check.

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 13, 2019

Contributor

Seems hard to do. The FindPythonLibs module makes a good attempt. I may try to recreate some of that for a couple of other modules.


if(TILEDB_FOUND)
set(CMAKE_REQUIRED_LIBRARIES "${TILEDB_LIBRARIES}")
include_directories(${TILEDB_INCLUDE_DIRS})

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

We are making an effort to move all such cmake directives to be target-specific. In this case, you probably only need to include it for the plugin targets.

This comment has been minimized.

Copy link
@normanb

normanb Feb 12, 2019

Author Contributor

👍

LINK_WITH
${TILEDB_LIBRARIES}
)
target_include_directories(${reader_libname} PRIVATE

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

Looks like you already have include directories here, so the directive in tiledb.cmake should just go away.

This comment has been minimized.

Copy link
@normanb

normanb Feb 12, 2019

Author Contributor

👍


template <typename T>
void makeBuffer(tiledb::Query& query, std::map<std::string, pdalboost::any>& mp, const std::string& key,
const unsigned bufSize)

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

We're trying very hard to eliminate boost, so perhaps we can find a way to eliminate the boost::any use. I can look it over more closely and see if I can see a reasonable alternative if you like.

This comment has been minimized.

Copy link
@normanb

normanb Feb 11, 2019

Author Contributor

Yes, that would be helpful, thanks. I used any because I needed a map of different types.

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 13, 2019

Contributor

I understand why you did it. I'll merge this and then make a PR against it that you can look over.


for (const auto& dim: dims)
{
layout->registerDim(Dimension::id(dim.name()));

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

Dimension::id() returns the ID of KNOWN dimensions. I don't know if you're expecting your data to only contain names of dimensions known to PDAL, but if not, you should use registerOrAssignDim() OR you can do something like:

Dimension::Id id = Dimension::id("name");
if (id != Dimension::Id::Unknown)
  layout->registerDim(id);
else
  layout->assignDim("name", SomeType); 

This comment has been minimized.

Copy link
@normanb

normanb Feb 11, 2019

Author Contributor

👍

Dimension::Id id = Dimension::id(attrName);
Dimension::Type t = view->dimType(id);

switch (t)

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

I'm not sure exactly what you're trying to do here. You don't need to switch on PDAL's type and cast. PDAL will return whatever type your writer wants, without regard to the way the data is stored in PDAL. If you want to write your data as doubles, just fetch them as doubles with "getFieldAs(...)". You don't need to worry about the internal type at all.

This comment has been minimized.

Copy link
@normanb

normanb Feb 11, 2019

Author Contributor

I was aiming to write the data in the type it is stored in. I don't want to use more storage by assuming doubles.

{
args.add("array_name", "TileDB array name", m_arrayName).setPositional();
args.add("config_file", "TileDB configuration file location", m_cfgFileName);
args.add("data_tile_capacity", "TileDB tile capacity", m_tile_capacity, unsigned(100000));

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

size_t instead of unsigned?

This comment has been minimized.

Copy link
@normanb

normanb Feb 11, 2019

Author Contributor

👍

else
{
throwError("TileDB requires the X,Y,Z dimensions to be the same type.");
}

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

You don't need this check. Unless you have done something strange, X, Y and Z are always stored as doubles. Furthermore, even if they aren't you can fetch the data as doubles, as you're doing below.

This comment has been minimized.

Copy link
@normanb

normanb Feb 11, 2019

Author Contributor

👍

for (const auto& kv : mapAttrVals)
{
Dimension::Id id = Dimension::id(kv.first);
Dimension::Type t = view->dimType(id);

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

Similar comments here as applied to the reader.

// check the sidecar exists
EXPECT_TRUE(pdal::Utils::fileExists(sidecar));
}
}

This comment has been minimized.

Copy link
@abellgithub

abellgithub Feb 11, 2019

Contributor

It would be nice if the tests made sure that data was written as expected and read as expected.

This comment has been minimized.

Copy link
@normanb

normanb Feb 11, 2019

Author Contributor

I will add more to the tests.

@normanb

This comment has been minimized.

Copy link
Contributor Author

normanb commented Feb 12, 2019

@abellgithub I have made the suggested changes. I made a couple of comments ^^. There are two remaining issues

  1. A map of different types (should this is a utility outside of this driver)
  2. I am intentionally mapping pdal types to tiledb types to reduce storage (rather than assuming all doubles).
Copy link
Contributor

abellgithub left a comment

Do you mean to commit the .tdb files? I don't see them referenced anywhere.

@normanb

This comment has been minimized.

Copy link
Contributor Author

normanb commented Feb 13, 2019

@abellgithub yes, the .tdb files are part of the array used in the reader test -

std::string pth(Support::datapath("tiledb/array"));

@abellgithub

This comment has been minimized.

Copy link
Contributor

abellgithub commented Feb 13, 2019

I figured I was missing something. I just didn't see them directly referenced and I know nothing about tiledb.

@abellgithub abellgithub merged commit efa1d8e into PDAL:master Feb 13, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.