Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tickets/DM-43384 #846

Closed
wants to merge 9 commits into from
Closed

tickets/DM-43384 #846

wants to merge 9 commits into from

Conversation

jgates108
Copy link
Contributor

No description provided.

ChunkMapException(Context const& ctx, std::string const& msg) : util::Issue(ctx, msg) {}
};

class CzarChunkMap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class needs to be documented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is probably not the most recent commit that you are commenting on. 10+ lines of documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simply followed a link to the PR from the ticket. At a time when I was reviewing the PR I didn't see any other commits. Did you do git push -f on your recent mods?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have found changes to the header in another commit titled:

Added unit test

That confused me since I followed the commit structure and expected that the class would be documented in a commit where the class was introduced.

Perhaps I should ignore the commit structure and review changes on the PR instead of commit-by-commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see the comments on the PR when I entered the previous message.

src/czar/CzarChunkMap.h Show resolved Hide resolved
src/czar/CzarChunkMap.h Outdated Show resolved Hide resolved
public:
using Ptr = std::shared_ptr<ChunkData>;
ChunkData(int chunkId_) : _chunkId(chunkId_) {}
int64_t const _chunkId;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public members of classes can't start with _. The same problem seems to exist for all members of both classes: ChunkData and WorkerChunksData.

The second comment is regarding a choice of type for chunk numbers. Chunks are small positive numbers starting with 0. Values of the numbers do not exceed 1 million. I can't even see a scenario when the chunk number would be 1 billion. The right type here would be unsigned int or uint32_t.

/* &&&
QMeta::ChunkMap QMetaMysql::getChunkMap() { // &&&
lock_guard<mutex> sync(_dbMutex);
>>>>>>> 07d082050 (Added code to read chunk disposition map and organize for czar use.)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because of the unresolved conflict during rebase against the "feature branch"?

void addToWorkerHasThis(std::shared_ptr<WorkerChunksData> const& worker);

/// Key is databaseName+tableName, value is size in bytes.
std::map<std::pair<std::string, std::string>, int64_t> _dbTableMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may replace the generic std::pair with something more specific and friendlier to the code readers, and which would be easier to track in the rest of the code:

struct DatabaseTable {
    std::strng database;
    std::strng table;
};

The second comment is that sizes in C++ are usually expressed using unsigned integer types std::size_t, std::uint64_t, etc. Negative values make no sense for the size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::pair works as a key in a map as long as both items are comparable. Struct can be done, but the comparison needs to be defined. I'm not sure it merits the effort here as database name + table name is a normal pairing for database tables.

class WorkerChunksData {
public:
using Ptr = std::shared_ptr<WorkerChunksData>;
WorkerChunksData(std::string const& wId) : _workerId(wId) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't very important. However, it's always easier to read a code where there is a direct mapping of the construction parameters to the corresponding members like in:

 WorkerChunksData(std::string const& workerId) : _workerId(workerId) {}

...unless you're going to have the public/protected method like:

std::string const& workerId() const { return _workerId; }

In the latter case, the following usually works well:

WorkerChunksData(std::string const& workerId_) : _workerId(workerId_) {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed name to workerId.


/// Map of chunks this worker will handle during shared scans.
std::map<int, ChunkData::Ptr> _sharedScanChunkMap;
int64_t _sharedScanTotalSize = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size is usually expressed using unsigned integer types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed


class WorkerChunksData;

class ChunkData {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a public nested class that has all its members public may be a big problem. The same comment applies to the relevant class WorkerChunksData. You allow any user of the class CzarChunkMap to mess with the internal state of CzarChunkMap as well as with internal states of ChunkData and WorkerChunksData. This design violates many key principles of the Object-oriented programming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the members are private with no method to change their values. Aside from calcTotalBytes, which I'm making private now.

CzarChunkMap can only set the values because it a friend to this class. Nothing else should be able to alter these. This applies to all nested classes in CzarChunkMaps.

The underlying issue is that these maps will be used heavily. Making copies or locking mutexes will be expensive, so the nested classes need to be exposed for rapid cheap access, but need to be read only for safety. The friend class seemed to be the best way to do that. const isn't really an option for _totalBytes or _primaryScanWorker.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can see it now after reviewing other changes made to these classes in the second commit:

Added unit test.

So, please, disregard my comments.

return false;
}

auto const& jsChunks = qChunkMap.chunks;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the chunk map was migrated away from the JSON library.

Copy link
Contributor

@iagaponenko iagaponenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

ChunkMapException(Context const& ctx, std::string const& msg) : util::Issue(ctx, msg) {}
};

class CzarChunkMap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have found changes to the header in another commit titled:

Added unit test

That confused me since I followed the commit structure and expected that the class would be documented in a commit where the class was introduced.

Perhaps I should ignore the commit structure and review changes on the PR instead of commit-by-commit?


class WorkerChunksData;

class ChunkData {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can see it now after reviewing other changes made to these classes in the second commit:

Added unit test.

So, please, disregard my comments.

@jgates108
Copy link
Contributor Author

This PR was made before the tickets/DM-43715 was created. Since that branch is still the same as main, the changes in the PR are the same, but this seems to have other effects on the PR.

@fritzm fritzm closed this Aug 5, 2024
@fritzm fritzm deleted the tickets/DM-43384 branch August 5, 2024 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants