Regex phrasing throws error #54

Closed
ndm13 opened this issue Mar 4, 2015 · 4 comments

ndm13 commented Mar 4, 2015

I think it's been mentioned before, but I can't find any specific documentation on it. Scallion throws an error when using ranges (e.g. scallion [a-d]) or repeating patterns (e.g. scallion [abcd]{3}). This makes it annoying to search for patterned addresses. For instance, scallion [abcd]{3} has to be written as scallion [abcd][abcd][abcd] to be considered valid.

Either a note in the documentation explaining this or an update to the parser would be appreciated.
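
Until the parser handles this, one possible workaround is to expand the pattern before passing it to scallion. Below is a minimal sketch in Python (not part of Scallion; the names `expand_class` and `expand_pattern` are hypothetical). It assumes the pattern consists only of literal characters, `[...]` classes with optional `x-y` ranges, and `{n}` repeat counts.

```python
import re

def expand_class(body: str) -> str:
    """Expand ranges such as a-d inside a class body into abcd."""
    out = []
    i = 0
    while i < len(body):
        # A range is three characters: start, '-', end.
        if i + 2 < len(body) and body[i + 1] == "-":
            start, end = ord(body[i]), ord(body[i + 2])
            if start > end:
                raise ValueError(f"bad range: {body[i:i + 3]}")
            out.extend(chr(c) for c in range(start, end + 1))
            i += 3
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

# One token: a [...] class or a single literal, optionally followed by {n}.
TOKEN = re.compile(r"\[([^\]]+)\](?:\{(\d+)\})?|([^\[\]{}])(?:\{(\d+)\})?")

def expand_pattern(pattern: str) -> str:
    """Rewrite ranges and {n} repeats into explicit bracket classes."""
    out = []
    pos = 0
    for m in TOKEN.finditer(pattern):
        if m.start() != pos:
            raise ValueError(f"unsupported syntax at position {pos}")
        pos = m.end()
        if m.group(1) is not None:
            unit, count = "[" + expand_class(m.group(1)) + "]", m.group(2)
        else:
            unit, count = m.group(3), m.group(4)
        out.append(unit * (int(count) if count else 1))
    if pos != len(pattern):
        raise ValueError(f"unsupported syntax at position {pos}")
    return "".join(out)

print(expand_pattern("[a-d]{3}"))  # [abcd][abcd][abcd]
```

The expanded string can then be given to scallion in place of the original pattern.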

ndm13 commented Mar 4, 2015

Also, it seems like long regex strings cause crashes. scallion [bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz] runs just fine, yet scallion [bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou] compiles, runs a hash check, and crashes. Anything longer than that just hangs at Compiling. If there's a length limit, I'm not sure where to find it.

Output of the latter:

Cooking up some delicions scallions...
Using kernel optimized from file kernel.cl (Optimized4)
Using work group size 128
Compiling kernel... done.
Testing SHA1 hash...
CPU SHA-1: d3486ae9136e7856bc42212385ea797094475802
GPU SHA-1: d3486ae9136e7856bc42212385ea797094475802
Looks good!
0x07452B91 (0x06346A00 0x077457D0 0x077457C4 0xD4D03B92) <unknown module>
0x07452B91 (0x06346A00 0x0023E158 0x0023E14C 0xD3878D0A) <unknown module>
0x5E06A0D0 (0x06346A00 0x077457D0 0x077457C4 0x00000000)0x5E06A0D0 (0x06346
A00 0x0023E158 0x0023E14C 0x00010009)

0x64D6DAC3 (0x00010009 0x00000000 0x00000000 0x0053A618)0x64D6DAC3 (0x00000
000 0x00000000 0x00000000 0x0053A618)

0x671EE1F2 (0x0023E200 0xD3814762 0x04E13E80 0x04E07920)0x671EE1F2 (0x07745
878 0xD4D6FD9A 0x04E03E80 0x04E07D20), ?GetTaskExecutor@TaskExecutor@OpenCL
@Intel@@YAPAVITaskExecutor@123@XZ() + 0x7AF2 bytes(s)
0x671E85FF (0x04E07920 0x04E07924 0x705B9EED 0xD38117AD), ?GetTaskExecutor@
TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() + 0x1EFF bytes(s)
0x671EE9FC (0x04E1F720 0x04E07920 0x705B6A0F 0xD3811639), ?GetTaskExecutor@
TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() + 0x82FC bytes(s)
0x671E52D0 (0x7760E0F2 0x00500840 0x00000070 0x00000000)
, ?GetTaskExecutor@TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() +
 0x7AF2 bytes(s)
0x7760E38C (0x00500840 0x00000070 0x00000000 0x0023E360)0x671E85FF (0x04E07
D20 0x04E07D24 0x705B9EED 0xD4D6AC35), ?GetTaskExecutor@TaskExecutor@OpenCL
@Intel@@YAPAVITaskExecutor@123@XZ() + 0x1EFF bytes(s)
0x671EE9FC (0x04E06F20 0x04E07D20 0xD4D6AC21 0x07745AA8), ?GetTaskExecutor@
TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() + 0x82FC bytes(s)
0x705B9760 (0x04E07D20 0x04E07D1C 0xD4D6FFB6 0x00000000), ?internal_wait@ta
sk_arena_base@internal@interface7@tbb@@IBEXXZ() + 0x2B20 bytes(s)
, RtlInitUnicodeString() + 0x164 bytes(s)
0x7760E0F2 (0x0023E398 0x06DBB5F0 0x0023E388 0x75350DBB), RtlAllocateHeap()
 + 0xAC bytes(s)
0x75350DA1 (0x00000000 0x05A8C708 0x00500840 0x0023E3B0), CreateEventExW()
+ 0x6E bytes(s)
0x75350DBB (0x00000000 0xD38146FA 0x00500860 0x00500874), CreateEventExW()
+ 0x88 bytes(s)
0x671E6629 (0x671E5F03 0xD3814122 0x00500840 0x00500840)
0x671E5EF5 (0x0023E450 0x64D7BD37 0x06DBB648 0x64D73951)
0x671E59FE (0xD3812530 0x05E5FFA0 0x058571A0 0x00000000)
0x64D73951 (0x07A8ABA8 0x6682A6DB 0xD3814048 0x06DAD804), clDevInitDeviceAg
ent() + 0x581 bytes(s)
0x6684DB57 (0x06DAD818 0x0023E4D4 0x66834102 0x6683411E), clWaitForEvents()
 + 0x70A47 bytes(s)
0x6682B42B (0x6683411E 0xD3814014 0xD38141E6 0x00000002), clWaitForEvents()
 + 0x4E31B bytes(s)
0x66834102 (0xD38141A4 0x06DAD790 0x06DAD800 0x058571A4), clWaitForEvents()
 + 0x56FF2 bytes(s)
0x6683465B (0x07A8AA98 0x02CF00EC 0x005575E3 0x66862590), clWaitForEvents()
 + 0x5754B bytes(s)
0x7760E0F2 (0xFFEEFFEE 0x00000000 0x05210010 0x004E00A8), RtlAllocateHeap()
 + 0xAC bytes(s)
0x01003310 (0x00000000 0x05210010 0x004E00A8 0x004E0000) <unknown module>
0xFFEEFFEE (0x05210010 0x004E00A8 0x004E0000 0x004E0000) <unknown module>
0xFFEEFFEE (0x05210010 0x004E00A8 0x004E0000 0x004E0000) <unknown module>

richardklafter commented

Scallion cannot run the regexes on the GPU. It converts them to a list of strings in https://github.com/lachesis/scallion/blob/gpg/scallion/RegexPattern.cs. We simply did not implement ranges or repeating patterns. If you're decent at C#, it would be awesome if you added those features.

The other issue you mentioned is a practical bug. Scallion converts a regex to a list of strings. If these strings are short (fewer than 7 or 8 characters), they are checked on the GPU. If the strings are long, they are checked on the CPU.

In your last example, [bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou], all the generated strings are short and are all checked on the GPU. Further, the set of generated strings is huge:
"bcdfghjklmnpqrstvwxyz".length*"aeiou".length*"aeiou".length*"bcdfghjklmnpqrstvwxyz".length*"aeiou".length = 55125 possibilities
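
The count can be checked with a quick sketch. This is Python for illustration, not Scallion's C# code from RegexPattern.cs, and the function names are hypothetical. It assumes a pattern made only of literal characters and explicit `[...]` classes, and it expands the pattern into strings the way described above.

```python
import re
from itertools import product
from math import prod

def parse_classes(pattern):
    """Return the allowed characters for each position of the pattern."""
    return [m.group(1) or m.group(2)
            for m in re.finditer(r"\[([^\]]+)\]|([^\[\]])", pattern)]

def count_strings(pattern):
    """Number of strings the pattern expands to (product of class sizes)."""
    return prod(len(chars) for chars in parse_classes(pattern))

def expand(pattern):
    """Lazily yield every string the pattern expands to."""
    for combo in product(*parse_classes(pattern)):
        yield "".join(combo)

PATTERN = "[bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou]"
print(count_strings(PATTERN))  # 55125 (21 * 5 * 5 * 21 * 5)
print(next(expand(PATTERN)))   # baaba
```

Every extra class multiplies the count by its size, which is why one more `[aeiou]` takes the pattern from 11025 strings to 55125.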

Try the pattern aaaaaaaa[bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou], which should run. This example only produces a single string that is checked on the GPU.

In summary, Scallion was written to find longer patterns. The regex support was added to allow you to search for variations and increase your odds of finding a collision. If you are looking for a shorter pattern, don't use regexes, because scallion can easily find a short collision.

That being said, scallion really should print out a warning if it's going to try to check 55125 patterns on the GPU :P

ndm13 commented Mar 4, 2015

Hey, thanks a lot. I was looking to get either alphabetic or word-like results, but it seems that this isn't feasible without running many threads because the string list would be massive! I wasn't aware of this, thanks for clarifying. C# isn't something I'm experienced in, so sorry I can't help out.

ndm13 closed this as completed Mar 4, 2015

lachesis commented Mar 5, 2015

For your use case, if your only goal is to get a prefix which follows the pattern "consonant vowel vowel consonant vowel", you're probably better off using shallot, or using scallion with a shorter pattern and the "-c" option and then filtering afterwards.
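
The "filtering afterwards" step could look roughly like the sketch below. The thread does not show what scallion's output looks like, so this assumes you have already extracted the generated addresses into a plain list of strings. `filter_wordlike` and the sample addresses are made up for illustration.

```python
import re

# Prefix pattern: consonant, vowel, vowel, consonant, vowel.
CVVCV = re.compile(
    r"^[bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou]"
)

def filter_wordlike(addresses):
    """Keep only the addresses whose first five characters match CVVCV."""
    return [addr for addr in addresses if CVVCV.match(addr.lower())]

# Made-up sample addresses, not real scallion output.
candidates = [
    "baabaxyz234567ab.onion",
    "aaaaabcdefghijkl.onion",
    "zeiwo2345abcdefg.onion",
]
print(filter_wordlike(candidates))
```

Because the matching happens on the CPU after generation, the regex here can use any syntax Python supports, including ranges and repeats.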
