Tickets/dm 8405 #3

iagaponenko · 2017-02-01T23:57:58Z

No description provided.

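Implemented two new tools:

sph-layout: spherical partitions explorer
sph-duplicate2: an advanced version of the chunk duplicator

Added dependencies on package 'sphgeom' for generating HTM IDs for levels higher than 13. This was needed by the new duplicator.

Filtering objects (and dependents) not falling within limits of the specified partition.

Improved the primary key generator by mixing in the chunk number into the lower 32-bit part of the keys.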

andy-slac

Looks mostly OK; there are a few things that can be improved:

throw new needs to be replaced with throw
curly brace initializers are out of favor; I think it's better to replace them with parentheses
boost::lexical_cast<std::string> could be replaced with std::to_string where possible
There are other minor comments you can look at.
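
To illustrate the first point, here is a minimal sketch (the function names checkTokenCount and describeFailure are hypothetical, not taken from the pull request). An exception thrown by value is caught by a handler for const std::exception&, whereas throw new would throw a pointer that such a handler never matches and that nobody deletes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical check: reject a row that has the wrong number of tokens.
void checkTokenCount(std::size_t found, std::size_t expected) {
    if (found != expected) {
        // Throw the exception object by value, not "throw new ...".
        throw std::range_error("too few tokens in a row of the input Object file");
    }
}

// Returns the error message, or an empty string when the row is fine.
std::string describeFailure(std::size_t found, std::size_t expected) {
    try {
        checkTokenCount(found, expected);
    } catch (const std::exception& ex) {
        // Reached for exceptions thrown by value. A thrown pointer
        // (std::range_error*) would not match this handler.
        return ex.what();
    }
    return std::string();
}
```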

andy-slac · 2017-02-03T18:05:45Z

ups/partition.cfg

-    "required": ["boost_system", "boost_filesystem", "boost_thread", "boost_program_options"],
-    "buildRequired": ["boost_test"],
+    "required": ["boost_system", "boost_filesystem", "boost_thread", "boost_program_options", "sphgeom"],
+    "buildRequired": ["boost_test", "sphgeom"],


I think buildRequired is an addition to required for build time; if you have a dependency in required, there is no need to also put it into buildRequired.

andy-slac · 2017-02-03T18:06:44Z

src/sph-layout.cc

+        return EXIT_FAILURE;
+    }
+    return EXIT_SUCCESS;
+}


GitHub shows that a newline is missing after the last character.

andy-slac · 2017-02-03T18:07:19Z

src/sph-duplicate2.cc

+        return EXIT_FAILURE;
+    }
+    return EXIT_SUCCESS;
+}


A newline is missing after the last character.

andy-slac · 2017-02-03T18:09:46Z

src/sph-layout.cc

@@ -0,0 +1,225 @@
+/*
+ * LSST Data Management System
+ * Copyright 2013 LSST Corporation.


andy-slac · 2017-02-03T18:21:32Z

src/sph-layout.cc

+#include <utility>
+#include <vector>
+#include <map>
+#include <cmath>


includes should be sorted

andy-slac · 2017-02-03T19:33:28Z

src/sph-duplicate2.cc

+    std::set<uint64_t> objIdOutOfBox;
+
+    /// Duplicate the next row of the chunk's Object table
+    size_t duplicateObjectRow (std::string              & line,


andy-slac · 2017-02-03T19:36:32Z

src/sph-duplicate2.cc

+
+        int idx = 0;
+        for (const std::string token : tokens) {  
+            if      (coldefObject.idxDeepSourceId == idx) { deepSourceId = boost::lexical_cast<uint64_t>(token); }


Isn't it easier to write

uint64_t deepSourceId = boost::lexical_cast<uint64_t>(tokens[coldefObject.idxDeepSourceId]);

without looping over whole thing?
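
A minimal sketch of the direct lookup suggested above. The ColDef struct and the readDeepSourceId function are hypothetical stand-ins for the tool's own types, and std::stoull replaces boost::lexical_cast<uint64_t> only so that the example needs nothing beyond the standard library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the column definitions used by the tool.
struct ColDef {
    std::size_t idxDeepSourceId;
};

// Read one column directly by its index instead of scanning every token.
std::uint64_t readDeepSourceId(const std::vector<std::string>& tokens,
                               const ColDef& coldef) {
    // at() throws std::out_of_range if the row is shorter than expected.
    return std::stoull(tokens.at(coldef.idxDeepSourceId));
}
```

Using at() rather than operator[] keeps the bounds check that the loop provided implicitly.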

andy-slac · 2017-02-03T19:41:29Z

src/sph-duplicate2.cc

+        // Then update the row and store the updated row as well.
+
+        if (opt.storeInput) {
+            tokens[coldefObject.idxDeepSourceId] = boost::lexical_cast<std::string> (newInputDeepSourceId);


std::to_string() is probably better for to-string conversion than boost::lexical_cast<std::string>; at least it is shorter.

andy-slac · 2017-02-03T19:56:53Z

src/sph-duplicate2.cc

+        size_t numProcessed = 0,
+               numRecorded  = 0;
+
+        std::ifstream  infile {  inFileName, std::ifstream::in };


std::ifstream::in is the default; there is no need to mention it explicitly.
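
For illustration, a short sketch of opening an input file without the mode argument. The countLines function is hypothetical and only serves to show the stream construction. It also uses parentheses rather than a curly brace initializer.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Count the lines of a text file. The stream is opened without an explicit
// mode argument: std::ifstream already defaults to std::ios_base::in.
std::size_t countLines(const std::string& inFileName) {
    std::ifstream infile(inFileName);
    if (!infile) {
        throw std::runtime_error("failed to open " + inFileName);
    }
    std::size_t numLines = 0;
    std::string line;
    while (std::getline(infile, line)) {
        ++numLines;
    }
    return numLines;
}
```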

andy-slac · 2017-02-03T20:04:22Z

src/sph-duplicate2.cc

+            tokens[colnum++] = token;
+        }
+        if (colnum != tokens.size())
+            throw new std::range_error("too few tokens in a row of the input Object file");


If you need to split a line into tokens, Boost has a string algorithm library which is probably much more efficient than doing it via stringstream: http://www.boost.org/doc/libs/1_60_0/doc/html/string_algo/usage.html#idm45555128601440
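
As a rough sketch of splitting without a stringstream, the function below uses only std::string::find. It is a standard-library stand-in chosen to keep the example self-contained. The Boost string algorithms referenced above (boost::split with boost::is_any_of) fill the same role. The name splitRow and the single-character delimiter are assumptions for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split a line on a single-character delimiter, keeping empty fields.
std::vector<std::string> splitRow(const std::string& line, char delimiter) {
    std::vector<std::string> tokens;
    std::size_t begin = 0;
    while (true) {
        const std::size_t end = line.find(delimiter, begin);
        if (end == std::string::npos) {
            // Last field: everything after the final delimiter.
            tokens.push_back(line.substr(begin));
            break;
        }
        tokens.push_back(line.substr(begin, end - begin));
        begin = end + 1;
    }
    return tokens;
}
```

The caller can then compare tokens.size() with the expected column count and throw if the row is too short.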

iagaponenko

Addressed all concerns.

iagaponenko · 2017-02-03T22:29:19Z

src/sph-layout.cc

@@ -0,0 +1,225 @@
+/*
+ * LSST Data Management System
+ * Copyright 2013 LSST Corporation.


iagaponenko · 2017-02-03T22:29:33Z

src/sph-layout.cc

+#include <utility>
+#include <vector>
+#include <map>
+#include <cmath>


iagaponenko · 2017-02-03T22:29:39Z

src/sph-layout.cc

+#include "boost/shared_ptr.hpp"
+
+#include "lsst/partition/Chunker.h"
+


iagaponenko · 2017-02-03T22:29:46Z

src/sph-layout.cc

+
+        Chunk2WorkerMap result;
+
+        std::ifstream infile {filename,  std::ifstream::in};


iagaponenko · 2017-02-03T22:29:52Z

src/sph-layout.cc

+            int32_t chunk;
+            std::string node;
+            is >> chunk;
+            is >> node;


iagaponenko · 2017-02-03T22:30:16Z

src/sph-layout.cc

+
+        ("chunk2worker", po::value<std::string>(), "Chunk-to-worker map.")
+
+        ("chunk", po::value<std::vector<int32_t>>(), "Chunk identifier.");


iagaponenko · 2017-02-03T22:31:17Z

src/sph-layout.cc

+            vm.count("chunk") ? vm["chunk"].as<std::vector<int32_t>>() : std::vector<int32_t>();
+
+        if (chunks.empty())
+            for (int32_t chunkId = 0 ; chunkId < 20000; ++chunkId)


Re-implemented by adding two command-line parameters that specify a range of chunk IDs, used when no specific IDs are passed to the application.
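
The fallback described here could look roughly like the sketch below. The function name selectChunks and the parameter names firstChunk and lastChunk are hypothetical, and the inclusive range is an assumption. The actual option names and semantics in sph-layout are not shown in this conversation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: use the explicitly requested chunk IDs if there are
// any, otherwise fall back to every ID in the inclusive range
// [firstChunk, lastChunk].
std::vector<std::int32_t> selectChunks(const std::vector<std::int32_t>& requested,
                                       std::int32_t firstChunk,
                                       std::int32_t lastChunk) {
    if (!requested.empty()) {
        return requested;
    }
    std::vector<std::int32_t> chunks;
    // A 64-bit counter avoids overflow when lastChunk is the largest int32_t.
    for (std::int64_t id = firstChunk; id <= lastChunk; ++id) {
        chunks.push_back(static_cast<std::int32_t>(id));
    }
    return chunks;
}
```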

iagaponenko · 2017-02-03T22:31:24Z

src/sph-layout.cc

+            std::cout << "  chunk2worker size: " << chunk2worker.size() << "\n";
+        }
+
+        part::Chunker chunker{overlap, numStripes, numSubStripesPerStripe};


iagaponenko · 2017-02-03T22:31:29Z

src/sph-layout.cc

+        return EXIT_FAILURE;
+    }
+    return EXIT_SUCCESS;
+}


iagaponenko · 2017-02-03T22:31:35Z

ups/partition.cfg

-    "required": ["boost_system", "boost_filesystem", "boost_thread", "boost_program_options"],
-    "buildRequired": ["boost_test"],
+    "required": ["boost_system", "boost_filesystem", "boost_thread", "boost_program_options", "sphgeom"],
+    "buildRequired": ["boost_test", "sphgeom"],


iagaponenko · 2017-02-03T22:35:34Z

The only request I chose to refuse is the one about replacing boost::lexical_cast<std::string> with std::to_string. The latter formats double values with a fixed six digits after the decimal point, which truncates them. This is not acceptable for this application because it is supposed to preserve the original precision of the duplicated contents.
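
The precision concern can be reproduced with a few lines of standard C++. The helper formatDouble below is a hypothetical illustration of one standard-library way to keep full precision. It is not the approach used in this pull request, which keeps boost::lexical_cast<std::string>.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Format a double with enough significant digits (max_digits10) for the
// text to convert back to exactly the same value.
std::string formatDouble(double value) {
    std::ostringstream os;
    os << std::setprecision(std::numeric_limits<double>::max_digits10) << value;
    return os.str();
}
```

By contrast, std::to_string(0.1234567890123) yields "0.123457", because it formats as if by printf with "%f", so the remaining digits are lost.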

andy-slac

I approve :)

iagaponenko added 5 commits December 8, 2016 22:24

Implemented two new tools: (f7d0b20)

sph-layout: spherical partitions explorer
sph-duplicate2: an advanced version of the chunk duplicator
Added dependencies on package 'sphgeom' for generating HTM IDs for levels higher than 13. This was needed by the new duplicator.

Filtering objects (and dependents) not falling within limits of the specified partition. (ddc316a)

Improved the primary key generator by mixing in the chunk number into the lower 32-bit part of the keys. (69b9bc9)

Fixed a bug in the PK generator (fd241fe)

Minor refactoring of tools: sph-duplicate2 and sph-layout (0e701f0)

andy-slac requested changes Feb 3, 2017

iagaponenko commented Feb 3, 2017

minor code changes to comply with LSST coding guidelines

a8b0c16

andy-slac approved these changes Feb 3, 2017

iagaponenko merged commit 721d0e5 into master Feb 3, 2017

ktlim deleted the tickets/DM-8405 branch August 25, 2018 03:48
