New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cryptocondtions (RIPD-1139) #2170

Open
wants to merge 15 commits into
base: develop
from

Conversation

Projects
None yet
6 participants
@seelabs
Contributor

seelabs commented Jul 13, 2017

This PR is not as large as it first looks. There is a large amount of
automatically generated code that is used for tests. Reviewing that code is not
necessary (reviewing the script that generates the tests, and the code output for
a single test should be enough).

Helpful references:
Cryptoconditions spec: https://tools.ietf.org/html/draft-thomas-crypto-conditions-02
ASN.1 spec: https://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf
Informal doc on ASN.1: http://luca.ntop.org/Teaching/Appunti/asn1.html
Web site to encode/decode ASN.1: http://asn1-playground.oss.com/

Note:

The max cryptoconditions size is larger than I'd like. This is because Rsa CCs
are relatively large, and without a large limit I can't write tests for
thresholds CCs that contain RsaSha256 CCs.

@seelabs seelabs requested review from mellery451 and ximinez Jul 13, 2017

@seelabs seelabs force-pushed the seelabs:der branch from b9176cb to 85108ca Jul 13, 2017

@MarkusTeufelberger

This comment has been minimized.

Contributor

MarkusTeufelberger commented Jul 13, 2017

Would it be possible to generate this code at build time instead of committing it, similar to protobuf code?

@seelabs

This comment has been minimized.

Contributor

seelabs commented Jul 13, 2017

@MarkusTeufelberger Yes, it would be possible. My biggest concern is the script requires several python libraries and python 3 (python 2 support may be easy). It's also useful (tho not required) to run clang-format over the generated code or it will be very difficult to see what's going on. Making sure these deps are installed on all systems that want to build rippled is a headache.

It also complicates the build scripts. Tho that's less of a concern.

@MarkusTeufelberger

This comment has been minimized.

Contributor

MarkusTeufelberger commented Jul 13, 2017

If it is not intended for humans, why format it?

You'll anyways likely need a build step that verifies that your version of the generated code is still identical to what would be generated in case the generation script changes, so you'll anyways somehow need to encode the code generation in a build script.

Which commit contains the actual script? Skipping through all of them seem to be relatively large, so I suspect these are the ones containing generated code...

@seelabs

This comment has been minimized.

Contributor

seelabs commented Jul 13, 2017

@MarkusTeufelberger The [fold] Generated tests for cryptoconditions contains the script (look for the python file). The main file is: https://github.com/ripple/rippled/blob/ec972c4bc2e5450b3162819b1be42eb5a64e93ee/bin/crypto_conditions_test_gen/conditions.py (Edit to give better link to main python file).

@MarkusTeufelberger

This comment has been minimized.

Contributor

MarkusTeufelberger commented Jul 13, 2017

Thanks, I'll take a look. It would be great if you could also check which exact library versions you installed using conda.

@seelabs

This comment has been minimized.

Contributor

seelabs commented Jul 13, 2017

@MarkusTeufelberger I'm using the following versions:
pyasn1: 0.2.3
cryptography: 1.8.1
pynacl: 1.1.2

}
size_t
contentLengthLength(std::uint64_t v)

This comment has been minimized.

@mellery451

mellery451 Jul 13, 2017

Contributor

I get an error here from clang (mac):

src/ripple/conditions/impl/Der.cpp|175 col 1 error| functions that differ only in their return type cannot be overloaded
|| contentLengthLength(std::uint64_t v)
|| ^
src/ripple/conditions/impl/Der.h|521 col 1| note: previous declaration is here
|| contentLengthLength(std::uint64_t);
|| ^

This comment has been minimized.

@seelabs

seelabs Jul 13, 2017

Contributor

@mellery451 Thanks. I just pushed a patch that should fix this (changed decl return type from size_t to uint64_t.

@codecov-io

This comment has been minimized.

codecov-io commented Jul 13, 2017

Codecov Report

Merging #2170 into develop will increase coverage by 0.46%.
The diff coverage is 84.92%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2170      +/-   ##
===========================================
+ Coverage    70.05%   70.51%   +0.46%     
===========================================
  Files          689      699      +10     
  Lines        50724    52046    +1322     
===========================================
+ Hits         35533    36702    +1169     
- Misses       15191    15344     +153
Impacted Files Coverage Δ
src/ripple/conditions/Fulfillment.h 100% <100%> (ø) ⬆️
src/ripple/conditions/impl/PreimageSha256.h 100% <100%> (+21.62%) ⬆️
src/ripple/conditions/Condition.h 100% <100%> (ø) ⬆️
src/ripple/app/tx/impl/Escrow.cpp 92.26% <100%> (+1.03%) ⬆️
src/ripple/conditions/impl/PrefixSha256.h 100% <100%> (ø)
src/ripple/conditions/impl/ThresholdSha256.h 100% <100%> (ø)
src/ripple/conditions/impl/RsaSha256.h 100% <100%> (ø)
src/ripple/basics/Slice.h 98.11% <100%> (+0.67%) ⬆️
src/ripple/conditions/impl/Ed25519.h 100% <100%> (ø)
src/ripple/conditions/impl/Ed25519.cpp 100% <100%> (ø)
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 589570d...d259692. Read the comment docs.

raise ValueError('Can only set bits between 0 and 31')
result[b] = 1
# reverse for big endian
return tuple(result)

This comment has been minimized.

@mellery451

mellery451 Jul 13, 2017

Contributor

hmm - I feel like one of these return statements should be removed.

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

fixed

def set_msg_self_and_children(self, msg):
'''
Set the message of a fulfillment so it suceeds.

This comment has been minimized.

@mellery451

mellery451 Jul 14, 2017

Contributor

minor: suceeds --> succeeds

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

fixed

'''
Set the message of a fulfillment so it suceeds.
This will set the message of child fulfillments so they
will suceeds as well, in particular, this means prefix subcondition

This comment has been minimized.

@mellery451

mellery451 Jul 14, 2017

Contributor

minor: suceeds --> succeed

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

fixed

@@ -0,0 +1,1217 @@
#!/usr/bin/env python

This comment has been minimized.

@mellery451

mellery451 Jul 14, 2017

Contributor

could we have a one-line description of what this script is for and if/when it should be executed?

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

fixed

@@ -333,6 +333,7 @@ class PreimageSha256_test : public beast::unit_test::suite
log << "Fulfillment deserialize error: " << std::get<0>(x) << " " << ec.message() << '\n';
}
auto const c = Condition::deserialize (hexblob(std::get<2>(x)), ec);

This comment has been minimized.

@mellery451

mellery451 Jul 14, 2017

Contributor

I'm confused about c here - do we care about the value? Should we verify correctness for it , or just ignore it?

This comment has been minimized.

@seelabs

seelabs Jul 14, 2017

Contributor

The condition here does not match the fulfillment. They were just dummy values used to test the parser (if you look at the fingerprint in the condition, it's 4849505152535455484950515253545548, which is clearly not a valid fingerprint.

The history is this test predated the generated tests (which do check that the conditions match the fulfillments), and I decided to keep the old tests. Having said all that, I do agree this should be improved. Now that I have python scripts, I'll fix the conditions so they match the fulfillments as expected and change the test. A second option would be to just remove this file (as the generated tests are better).

condition_logger = logging.getLogger('condition')
condition_test_template_prefix = \
'''

This comment has been minimized.

@mellery451

mellery451 Jul 14, 2017

Contributor

Should we add a comment at the top indicating that the file was generated and "DO NO EDIT"? Or are we okay with people editing the generated sources?

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

fixed

length and sort order so they do not need to be recomputed. Note that storing
values in the cache is type dependent, and the address of the variable must be
stable while encoding. It makes sense to cache higher level values, but not
primitives.

This comment has been minimized.

@mellery451

mellery451 Jul 14, 2017

Contributor

this doc is very clear and helpful 👍

{
test(i->first, i->second);
using Traits = cryptoconditions::der::DerCoderTraits<std::string>;
this->BEAST_EXPECT(Traits::compare(i->first, i->first, dummy) == 0);

This comment has been minimized.

@mellery451

mellery451 Jul 14, 2017

Contributor

any particular reason for this->BEAST_EXPECT(... vs. just BEAST_EXPECT(...? It doesn't bother me either way, just wondering.

This comment has been minimized.

@seelabs

seelabs Jul 17, 2017

Contributor

The this is needed in a lambda. I've forgotten the history of this code, but it's likely it used to live in a lambda and when I moved it outside the lambda I forgot to remove the this.

Will fix.

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

fixed

@seelabs

This comment has been minimized.

Contributor

seelabs commented Jul 17, 2017

Pushed a commit to support clang's fuzzing sanitizer (requires clang 5)

@MarkusTeufelberger

This comment has been minimized.

Contributor

MarkusTeufelberger commented Jul 17, 2017

Great idea using libFuzzer for this. 👍

add_executable(fulfillment_fuzzer ${FuzzerSrc}
${CMAKE_SOURCE_DIR}/src/fuzzers/Conditions_fuzz_test.cpp)
target_link_libraries(fulfillment_fuzzer ${OPENSSL_LIBRARIES} ${SANITIZER_LIBRARIES})

This comment has been minimized.

@mellery451

mellery451 Jul 17, 2017

Contributor

it looks to me like SANITIZER_LIBRARIES is no longer being set, so should we just get rid of it ?

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

Fixed (removed SANITIZER_LIBRARIES). Although I might bring that back for gcc builds if there are complaints (clang doesn't need it).

-I"${CMAKE_SOURCE_DIR}/"src/ed25519-donna)
add_executable(fulfillment_fuzzer ${FuzzerSrc}
${CMAKE_SOURCE_DIR}/src/fuzzers/Conditions_fuzz_test.cpp)

This comment has been minimized.

@mellery451

mellery451 Jul 17, 2017

Contributor

could Conditions_fuzz_test.cpp just be added to FuzzerSrc ?

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

Fixed (and an explanation - I originally had two separate "main" functions, one for testing fulfullments and one for conditions. When I merged them into a single file I should have also fixed the build script).

@@ -447,16 +447,18 @@ if (WIN32 OR is_xcode)
group_sources(Builds)
endif()
if(unity)
if (NOT fuzzer_conditions)

This comment has been minimized.

@mellery451

mellery451 Jul 17, 2017

Contributor

I don't have a strong opinion, but why not just allow rippled to be a target? You can still just build the fuzzed exes by name if you don't want to build rippled e.g. make fulfillment_fuzzer.

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

Fixed

add_executable(condition_fuzzer ${FuzzerSrc}
${CMAKE_SOURCE_DIR}/src/fuzzers/Conditions_fuzz_test.cpp)
target_link_libraries(condition_fuzzer ${OPENSSL_LIBRARIES} ${SANITIZER_LIBRARIES})

This comment has been minimized.

@mellery451

mellery451 Jul 17, 2017

Contributor

I have not used libFuzzer myself, but docs indicate that you still need to link with libFuzzer.a - so we might need to add that here ?

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

It's building fine for me without it. Clang links in the correct lib if you link with -fsanitize=fuzzer,address. (The docs for AddressSanitizer say this explicitly: https://clang.llvm.org/docs/AddressSanitizer.html). You're right, I can't find the same snippet with the FuzzerSanitizer, but it is acting the same way as AddressSanitizer.

My motivation for removing the explicit link is I was getting errors about linking in incompatible sanitizer libraries. The easiest way for me to resolve this was to let clang choose the right library for me.

@@ -518,6 +520,40 @@ foreach(target IN LISTS targets)
link_common_libraries(${target})
endforeach()
if (fuzzer_conditions)

This comment has been minimized.

@mellery451

mellery451 Jul 17, 2017

Contributor

I don't think there is much need for a user-defined variable for this since you can just set it based on the san values, something like: mellery451@a303c0f

This comment has been minimized.

@seelabs

seelabs Jul 18, 2017

Contributor

I'd like to keep this variable. If we add other fuzzers I think we want to control which fuzzers we are building. I don't mind the slightly clunkier build step for this because it's an uncommon build target.

@mellery451

Impressive stuff with the serialization design and testing. I wish there were a standard set of test data for DER encoders that we could also run through the ser/deser classes, but I could not find anything. I think the fuzzing is great as well.

As for the code generation thing, here's a commit that does this: mellery451@b19911d . I've tested on mac/linux and it does the basic job. I would not be surprised if there are fixes required for windows (esp. the python commands). It's cmake only, so we should only consider it once we finally drop scons.

@seelabs seelabs force-pushed the seelabs:der branch from 5f05ba1 to d723cf0 Jul 21, 2017

@seelabs

This comment has been minimized.

Contributor

seelabs commented Jul 21, 2017

rebased onto 0.80.0-b3

@seelabs seelabs force-pushed the seelabs:der branch from d723cf0 to 8919dfc Aug 1, 2017

@seelabs

This comment has been minimized.

Contributor

seelabs commented Aug 1, 2017

Rebased onto 0.80.0-b4

@ximinez

I FINALLY FINISHED!

BEAST_EXPECT(*s.data() == 42);
s = ms; // can assign an immutable slice from a mutable slice
auto const svs = makeSlice(mutValues); // can create a slice from a boost small vector
BEAST_EXPECT(*svs.data() == 42);

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

How about a call to the new push_back?

ms.push_back(37);
BEAST_EXPECT(mutValues[0] == 37);
BEAST_EXPECT(*ms.data() == 1);

This comment has been minimized.

@seelabs

seelabs Aug 4, 2017

Contributor

Fixed in a028ad7 [fold] Responding to Ed PR feedback 1

void
encodeContentLength(MutableSlice& dst, std::uint64_t v, std::error_code& ec);
/** return the number of bytes required to encode a the given content length

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

Nit: Extra article a the.

This comment has been minimized.

@seelabs

seelabs Aug 4, 2017

Contributor

Fixed in a028ad7 [fold] Responding to Ed PR feedback 1

are added to the stream using the `<<` operator. After all the values are
added to the encoder, it must be terminated with a call to `eos()`. As a
convenience, there is a special variable called `eos` that when streamed will
call the streams `eos()` function. Typically, the code to encode values to a

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

Nit: stream's

This comment has been minimized.

@seelabs

seelabs Aug 4, 2017

Contributor

Fixed in a028ad7 [fold] Responding to Ed PR feedback 1

The decode class has an interface similar to a c++ output stream. Values are
decoded from the stream using the `>>` operator. After all the values are
decoded, it must be terminated with a call to `eos()`. As a convenience, there
is a special variable called `eos` that when streamed will call the streams

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

Nit again: stream's

This comment has been minimized.

@seelabs

seelabs Aug 4, 2017

Contributor

Fixed in a028ad7 [fold] Responding to Ed PR feedback 1

if (!isSigned && slice.size() == sizeof(T) + 1 && slice[0])
{
// since integers are coded as two's complement, the first byte may
// be zero for unsigned reps

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

must be zero???

This comment has been minimized.

@seelabs

seelabs Aug 4, 2017

Contributor

No, it may be zero, but it is not necessarily zero. For example, the unsigned value one is encoded as 0x1. If the leading 8th bit of a positive value is 1, we need to add a leading 0 to distinguish it from a negative value.

auto const& e = std::get<index>(elements);
using ElementTraits = DerCoderTraits<std::decay_t<decltype(e)>>;
// visual studio can't handle childNum = index, use decltype

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

Since you updated the type of index without updating any other code are these last 2 comments still true?

This comment has been minimized.

@seelabs

seelabs Aug 4, 2017

Contributor

Fixed in a028ad7 [fold] Responding to Ed PR feedback 1

@@ -511,20 +511,28 @@ endmacro()
macro(setup_build_boilerplate)

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

The commit message is marked fold, but please don't lose the instructions for the fuzzer when this is squashed.

@@ -511,20 +511,28 @@ endmacro()
macro(setup_build_boilerplate)
if (NOT WIN32 AND san)
add_compile_options(-fsanitize=${san} -fno-omit-frame-pointer)
STRING(REPLACE ";" "," comma_san "${san}")
add_compile_options(-fsanitize=${comma_san} -fno-omit-frame-pointer)

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

Is there a reason we can't spec to pass a comma on the command line instead of the semicolon?

This comment has been minimized.

@seelabs

seelabs Sep 8, 2017

Contributor

Yes. There are places where we want to treat ${san} like a list and iterate through its individual values - so we need a semi-colon. There are other places where we need to separate them with a comma because that's how the compiler wants them. So if we passed them with a comma on the command line we'd have to replace that comma with a semi-colon in other places.

static int fileNum = 0;
std::ofstream ofs;
char fileName[32];
sprintf(fileName, "logic_error%d.dat", fileNum++);

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

If the intention here is to prevent overwriting an existing file, I would suggest a loop to check that the file exists, and incrementing again if it does.

This comment has been minimized.

@seelabs

seelabs Aug 4, 2017

Contributor

Fixed in a028ad7 [fold] Responding to Ed PR feedback 1

@@ -67,6 +84,9 @@
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/
// THIS FILE WAS AUTOMATICALLY GENERATED -- DO NOT EDIT

This comment has been minimized.

@ximinez

ximinez Aug 1, 2017

Contributor

This comment is not in the generated test files. I assume that means you need to regenerate them. (See also my earlier comment about scripting the generation.)

@seelabs

This comment has been minimized.

Contributor

seelabs commented Aug 4, 2017

@ximinez I pushed a commit to address most of your feedback and marked the ones I addressed. However I still have a few more I intend on addressing.

@ximinez

Other than the docs build problem, these changes look great. I look forward to the next round of changes.

@@ -124,6 +124,7 @@ INPUT = \
../src/ripple/app/tx/applySteps.h \
../src/ripple/app/tx/impl/InvariantCheck.h \
../src/ripple/app/consensus/RCLValidations.h \
../src/ripple/conditions/impl/Der.h \

This comment has been minimized.

@ximinez

ximinez Aug 18, 2017

Contributor

As noted elsewhere, the docs fail to build now.

/** encode the contents used to calculate a fingerprint
@note Most cryptoconditions (excepting preimage) calculate their
fingerprints by encoding into a ans.1 der format and hashing the

This comment has been minimized.

@nbougalis

nbougalis Aug 25, 2017

Member

"ASN.1" and "DER"

This comment has been minimized.

@seelabs

seelabs Sep 8, 2017

Contributor

Fixed in ad0de5e [fold] Address feedback from review:

Type t,
std::uint32_t c,
std::array<std::uint8_t, 32> const& fp,
std::bitset<5> const& s = std::bitset<5>{})

This comment has been minimized.

@nbougalis

nbougalis Aug 25, 2017

Member

Why not simply std::bitset<5> s = {}

This comment has been minimized.

@seelabs

seelabs Sep 8, 2017

Contributor

Fixed in ad0de5e [fold] Address feedback from review:

@seelabs seelabs force-pushed the seelabs:der branch from a028ad7 to d259692 Aug 31, 2017

seelabs added some commits Jun 9, 2017

Add MutableSlice class:
* Add a class for a mutable linear buffer
* Add support to make a slice from a `boost::container::small_vector`
[fold] Support for fuzzers:
* Note: requires clang's fuzzer sanitizer (included in clang 5)
* Python script to create a fuzz corpus
* Allow python script to run outside ipython
* Sample cmake:
  cmake -Dfuzzer_conditions=ON -Dsan='fuzzer;address' -Dtarget=clang.debug ..
* Sample Run:
  ./fulfillment_fuzzer <path_to_corpus_dir>  -max_len=5000 -jobs=3
* There are separate executables for fulfillment and conditions.

@seelabs seelabs force-pushed the seelabs:der branch from d259692 to b7a405a Sep 8, 2017

@seelabs

This comment has been minimized.

Contributor

seelabs commented Sep 8, 2017

I pushed another set of changes. I still have two issues left to address: the failing doc generation and dealing with the automatically generated cpp test files.

[fold] Address feedback from review:
* Split Der.{h,cpp} into multiple files and reord includes
* rename asn.1 -> ASN.1; der -> DER
* std::bitset<5> s = {}
* Add comment justifying Ed25519 cost magic number

@seelabs seelabs force-pushed the seelabs:der branch from b7a405a to ad0de5e Sep 8, 2017

@ximinez

This comment has been minimized.

Contributor

ximinez commented Oct 11, 2017

There is some good news about the doc generation. Since you split Der.h into three files, and doxygen doesn't follow includes, the docs build now. The bad news is that they don't pick up any of the DER code. :)

But there's more good news! Because I could add those three files to source.dox one at a time, I was able to narrow the problem down to DerPrimitiveTraits.h. I still have no idea what the actual failure is, but at least it's narrowed down slightly.

@seelabs

This comment has been minimized.

Contributor

seelabs commented Nov 30, 2017

I tracked down the documentation errors. There are three places that confuse document generation. The problem cases are:
1)

template <std::size_t S>
struct DerCoderTraits<boost::container::small_vector<std::uint8_t, S>> : OctetStringTraits
template <std::size_t S>
struct DerCoderTraits<boost::container::small_vector<std::uint8_t, S>> : OctetStringTraits
template <class... Ts>
struct DerCoderTraits<std::tuple<Ts&...>>

I tried some workarounds with #if !GENERATING_DOCS and I can get them to generate docs by changing the declaration. However, it makes the code much harder to read. The benefit from the docs is not worth the new difficulty in reading the code.

My solution is to not generate docs for all three cases (I wrapped the entire definition in #if !GENERATING_DOCS. I am going to do resolve some other issues in the PR before I push the commit that fixes the docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment