New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Class D support #469

Closed
wants to merge 13 commits into
base: develop
from

Conversation

Projects
None yet
3 participants
@zathras-crypto
Copy link

zathras-crypto commented May 9, 2017

This PR adds Class D support to Omni Core.

Class D is not defined in the spec but can be summarized as follows:

Class D is identical to Class C with the following changes:

  • The marker bytes have been shortened to ol rather than omni (@dexX7 I should have listened to you earlier hehe)
  • Integers in payloads are compressed with variable length encoding similar to LEB128

The use of Class D is significantly more efficient than Class C (on average payloads are reduced by 40-50%). Class D is activated via the feature activation system and Omni Core will start sending Class D by default once the activation has occurred.

This is an initial PR to request feedback and should be considered in progress...

Thanks! :)
Z

@zathras-crypto zathras-crypto force-pushed the zathras-crypto:0.2-Z-ClassD branch from 332fa68 to 1ae9dd7 May 9, 2017

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented May 11, 2017

Awesome!

Back when we added Class C support, I made sure we can handle multiple push operations, and because multiple pushes in OP_RETURN are allowed for some time now, I believe we can safely make us of it.

Here is my proposal: instead of pushing [olpayload], we embed [ol][payload].

Ultimately this allows to set a bloom filter to ol, and identify Omni transactions in a SPV-ish way. This is just a gimick though, because we probably will never be ably to rely on SPV security.

What do you think? CC: @msgilligan

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented May 15, 2017

I actually really, really like that idea - SPV on it's own may not be viable but in combination with other methods (such as querying a consensus hash across multiple API services) there may be a possibility of a lower security approach, allowing users to choose their own tolerance for risk against convenience...

int nBlockNow = GetHeight();
bool fDataEnabled = GetBoolArg("-datacarrier", true);
bool fClassDEnabled = IsFeatureActivated(FEATURE_CLASS_D, nBlockNow);
if (nDataSize == 0) { // no compressed payload supplied

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

What's the specific purpose of this?

We don't use it in UseEncodingClassC.

This comment has been minimized.

@zathras-crypto

zathras-crypto May 16, 2017

This check is to make sure we fall back to Class C if there is no compressed payload available.

More detail:

WalletTxBuilder() has parameters for both a regular payload and a compressed payload.

There are certain functions that intentionally don't use compressed payloads (eg management functions like deactivation & alerting don't use Class D yet just in case we find a bug in it) and things like raw transaction creation also aren't Class D aware.

In these cases the compressed payload parameter passed to WalletTxBuilder() is empty, which means this check will cause UseEncodingClassD() to return false (and thus cause a fallback to Class C).

Hope that makes sense :)

I could move the check for an empty compressed payload into WalletTxBuilder() itself rather than UseEncodingClassD() if you think that is a more suitable place for it?

This comment has been minimized.

@dexX7

dexX7 May 16, 2017

Member

Aah, I see!

I could move the check for an empty compressed payload into WalletTxBuilder() itself rather than UseEncodingClassD() if you think that is a more suitable place for it?

Sounds good to me, I think. :)

if (encodingClass == OMNI_CLASS_D) {
int i = 0;

std::vector<uint8_t> vecVersionBytes = GetNextVarIntBytes(i);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Is this safe in any case, e.g. without payload and only marker? If I'm not mistaken, pkt[0] is accessed in that case, too, and without payload, this results in an out of bounds array access?

This comment has been minimized.

@zathras-crypto

zathras-crypto May 16, 2017

Great catch, you're right...

Do you think this is sufficient to catch all potential calls to GetNextVarIntBytes() that may overrun the end of the payload?

 std::vector<uint8_t> CMPTransaction::GetNextVarIntBytes(int &i) {
     std::vector<uint8_t> vecBytes;
 
+    if (i >= pkt_size) {
+        return vecBytes;
+    }
+
     do {
         vecBytes.push_back(pkt[i]);
         if (!IsMSBSet(&pkt[i])) break;

This comment has been minimized.

@dexX7

dexX7 May 16, 2017

Member

Looks good to me.

std::vector<uint8_t> vecVersionBytes = GetNextVarIntBytes(i);
std::vector<uint8_t> vecTypeBytes = GetNextVarIntBytes(i);

memcpy(&ecosystem, &pkt[i], 1);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Seems unsafe, if we don't check the length first!

This comment has been minimized.

@zathras-crypto

zathras-crypto May 16, 2017

Yep, you're right - fixed on each of the occurrences you noted (plus two more for early bird and percentage in interpret_CreatePropertyVariable()).

std::vector<uint8_t> vecAmountBytes = GetNextVarIntBytes(i);
std::vector<uint8_t> vecAmountDesiredBytes = GetNextVarIntBytes(i);

memcpy(&blocktimelimit, &pkt[i], 1);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Also seems unsafe, if we don't check the length of pkt first!

std::vector<uint8_t> vecMinFeeBytes = GetNextVarIntBytes(i);

if (version > MP_TX_PKT_V0) {
memcpy(&subaction, &pkt[i], 1);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Seems unsafe, if we don't check the length of pkt first.

std::vector<uint8_t> vecVersionBytes = GetNextVarIntBytes(i);
std::vector<uint8_t> vecTypeBytes = GetNextVarIntBytes(i);

memcpy(&ecosystem, &pkt[i], 1);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Seems unsafe, if we don't check the length of pkt first.

std::vector<uint8_t> vecVersionBytes = GetNextVarIntBytes(i);
std::vector<uint8_t> vecTypeBytes = GetNextVarIntBytes(i);

memcpy(&ecosystem, &pkt[i], 1);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Seems unsafe, if we don't check the length of pkt first.

}

int j = 0;
memcpy(category, spstr[j].c_str(), std::min(spstr[j].length(), sizeof(category)-1)); j++;

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Actually, we might ship a fix with class D encoding for "longer than allowed string sequences" here.

Right now it is possible to construct transactions, which have more characters than allowed in text fields. The extra characters are currently ignored, but it might be considered to invalidate such transactions.

This comment has been minimized.

@zathras-crypto

zathras-crypto May 16, 2017

I don't think this would offer much near term (as you couldn't fit >255 chars of metadata in a Class D anyway without support from a miner) but the current default nulldata limit may not always be the same so I do see why it might be worth doing...

In that case do you think we'd look at it as a malformed transaction and drop it completely (rather than parsing it and just marking it invalid)?

P.S. personally I'd like to abandon category and subcategory completely (I did in the Litecoin port hehe) and restrict the number of characters for the remaining metadata but that's a change that probably belongs in a different PR.

This comment has been minimized.

@dexX7

dexX7 May 16, 2017

Member

Good point, let's not mix it. If we want to tackle it, our prime candidate is probably indeed class B.

std::vector<uint8_t> vecVersionBytes = GetNextVarIntBytes(i);
std::vector<uint8_t> vecTypeBytes = GetNextVarIntBytes(i);

memcpy(&ecosystem, &pkt[i], 1);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Seems unsafe, if we don't check the length of pkt first.

std::vector<uint8_t> vecVersionBytes = GetNextVarIntBytes(i);
std::vector<uint8_t> vecTypeBytes = GetNextVarIntBytes(i);

memcpy(&ecosystem, &pkt[i], 1);

This comment has been minimized.

@dexX7

dexX7 May 15, 2017

Member

Seems unsafe, if we don't check the length of pkt first.

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented May 15, 2017

Awesome, it roughly looks like this (from here).

if (txClass == OMNI_CLASS_D) {
  script << OP_RETURN << vchOmBytes << vchPayload;
} else {
  script << OP_RETURN << vchData;
}

... and we'd probably need to check, whether the other format is also <= nMaxDatacarrierBytes. Maybe via something like:

if (script.size()-1 > nMaxDatacarrierBytes) { return false; }

... at the very end, after constructing the script. This needs to be double checked, to make sure we're not off by one or two bytes due to the OP_RETURN or the push operations.


Another point we might consider:

Currently we're using a whole byte for the ecosystem, even though we'd get away with only one bit. Maybe we could save some space here as well?

I'm also wondering, whether we could save some space by handling property ids as [ecosystem bit][identifier], instead of [identifier with ecosystem offset].

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented May 16, 2017

This needs to be double checked, to make sure we're not off by one or two bytes due to the OP_RETURN or the push operations.

Ahhh... Hmm - that's something I didn't consider - we lose an extra byte for the second pushdata. I do really like the idea of separating the marker, but it depends if we're actually going to make use of that in some way (otherwise we're just burning a byte for nothing).

I'm still kind of leaning towards doing it anyway...

P.S. I did test just to be 100% sure, and the script size is always +1 byte with the second push as expected:

zathras@zlap:~/github/build/branches/classd$ grep testone  ~/.bitcoin/regtest/omnicore.log 
2017-05-16 03:31:20 payload size: 10, marker size: 2, testone size: 12, testwo size: 13
2017-05-16 03:31:20 payload size: 5, marker size: 2, testone size: 7, testwo size: 8
2017-05-16 03:31:21 payload size: 19, marker size: 2, testone size: 21, testwo size: 22
2017-05-16 03:31:21 payload size: 12, marker size: 2, testone size: 14, testwo size: 15
2017-05-16 03:31:21 payload size: 6, marker size: 2, testone size: 8, testwo size: 9
2017-05-16 03:31:21 payload size: 5, marker size: 2, testone size: 7, testwo size: 8
2017-05-16 03:31:21 payload size: 58, marker size: 2, testone size: 60, testwo size: 61
2017-05-16 03:31:21 payload size: 42, marker size: 2, testone size: 44, testwo size: 45
2017-05-16 03:31:21 payload size: 56, marker size: 2, testone size: 58, testwo size: 59
2017-05-16 03:31:21 payload size: 7, marker size: 2, testone size: 9, testwo size: 10
@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented May 16, 2017

Currently we're using a whole byte for the ecosystem, even though we'd get away with only one bit. Maybe we could save some space here as well?

That's an interesting idea and you're right (divisibility could be expressed in just one bit too and maybe other fields) but I'm not sure how to put that to work in practice...

I'm also wondering, whether we could save some space by handling property ids as [ecosystem bit][identifier], instead of [identifier with ecosystem offset].

Currently the variable length integer encoding is poor for test ecosystem properties, but I figured since they form less than 0.1% of our transactions that it wasn't a big deal...

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented May 16, 2017

Thanks so much for reviewing dude, really appreciate your feedback :)

I also found another flaw (a pretty big one!) in that while it won't send Class D before the feature is activated, it'll still parse them as valid...

D'oh!

Adding a check now to reject Class D during parsing before activation...

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented May 16, 2017

I'm still kind of leaning towards doing it anyway...

Me too! We'd be the first meta-protocol with SPV-ish features. :p

Currently the variable length integer encoding is poor for test ecosystem properties, but I figured since they form less than 0.1% of our transactions that it wasn't a big deal...

That's also a good point. We might consider it in the future though, maybe when/if we overhaul other parts (e.g. removing text fields).

@msgilligan

This comment has been minimized.

Copy link
Member

msgilligan commented May 20, 2017

Just noticing that I was tagged in this PR...

Looks like a lot of reading -- will try to catch up this weekend. Sorry for being slow to respond.

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 30, 2018

Given this PR is outdated and needs further review, it's going to be closed for now.

@dexX7 dexX7 closed this Jan 30, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment