# Finals: Rev - Oneliner (676)

## Challenge
We are given a `oneliner` file, which is disgustingly long (9.4 MB) and is just an elaborate flag checker made out of many `if else` statements.

I figured that the proper way would be to actually write a parser to simplify the whole thing, but I was extremely lazy (and not proficient in dealing with ASTs), so I ended up taking a longer(?) and rather silly method.

`re` gives us the regex we need to find relevant indexes of our flag.

I also renamed it from a `.py` file to a `.txt` file, otherwise my linter would try (and fail miserably) at syntax highlighting.

In [1]:
import re

with open('oneliner.txt', 'r') as f:
    data = f.read()

So the idea is that we know the flag format is `grey{...}`.
Checking for `data.index("grey[18] == ")` raises a `ValueError` because that substring is not in the file, so the highest index checked is 17 and the flag is 18 characters long.

Hence all the relevant code must be within the indices where `grey[0] == 'g'` and `grey[17] == '}'`
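
The length check can also be done in a loop instead of by hand. This is only a minimal sketch, assuming every flag position appears in at least one `grey[n] == ` comparison; `flag_length` is a hypothetical helper name, and it uses `str.find` (which returns `-1` on a miss) rather than `str.index` (which raises).

```python
def flag_length(source, name="grey"):
    """Return the first position n for which no `name[n] == ` comparison exists."""
    n = 0
    # str.find returns -1 when the substring is absent, so no try/except is needed
    while source.find(f"{name}[{n}] == ") != -1:
        n += 1
    return n

# Toy stand-in for the real file contents
sample = "grey[0] == 'g' and grey[1] == 'r' and grey[2] == 'e'"
print(flag_length(sample))
```

On the real file this would be called as `flag_length(data)`.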

In [3]:
i1 = data.index("grey[0] == 'g'")
i2 = data.index("grey[1] == 'r'")
i3 = data.index("grey[2] == 'e'")
i4 = data.index("grey[3] == 'y'")
i5 = data.index("grey[4] == '{'")
e1 = data.index("grey[17] == '}'")
starts = [i1,i2,i3,i4,i5]
for i in range(5):
    print(f"Character {i+1} is at index {starts[i]}")

print("Ending index:", e1)

Character 1 is at index 3556462
Character 2 is at index 5054729
Character 3 is at index 5054750
Character 4 is at index 5056863
Character 5 is at index 5244326
Ending index: 7207432


So we only need to look at the indices between character 5 and the last character. I added some buffer just for fun.

In [5]:
# Same slice as data[5244326-15 : 7207432+15], written with the variables found above
relevant = data[i5 - 15 : e1 + 15]

Now for some professional flag guessing and elimination, using an extremely inefficient search method. We only have 3 options for index 5, and one of those is not like the others: it is the only lowercase one.

Hence select `q` with extreme confidence

In [6]:
index_5 = [i for i in range(len(relevant)) if relevant.startswith('grey[5] == ', i)]
for i in index_5:
    print(relevant[i:i+15], i)

grey[5] == 'T') 36
grey[5] == 'F') 13550
grey[5] == 'q') 690257
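
The same listing can be produced with `re`, which is already imported. A rough sketch, assuming every check looks like `grey[n] == 'c'` with exactly one character between the quotes (escaped quotes are not handled); `candidates` is a hypothetical helper name.

```python
import re

def candidates(text, position):
    """List (character, offset) for every `grey[position] == 'c'` check in text."""
    pattern = re.compile(r"grey\[" + str(position) + r"\] == '(.)'")
    return [(m.group(1), m.start()) for m in pattern.finditer(text)]

# Toy stand-in for `relevant`
toy = "(grey[5] == 'T') ... (grey[5] == 'q') ... (grey[6] == 'u')"
for ch, offset in candidates(toy, 5):
    print(ch, offset)
```

This scans the text once per position instead of calling `startswith` at every offset, which should matter on a multi-megabyte string.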


Repeat the same method. Now `q` is a hard character to match after, so it's definitely `u`.

In [8]:
index_6 = [i for i in range(len(relevant)) if relevant.startswith('grey[6] == ', i)]
for i in index_6:
    print(relevant[i:i+15], i)
#select u

grey[6] == 'j') 57
grey[6] == 'h') 3304
grey[6] == 'n') 13085
grey[6] == 't') 13571
grey[6] == 'O') 48340
grey[6] == 'H') 51037
grey[6] == 'S') 125535
grey[6] == 'g') 688547
grey[6] == 'D') 689789
grey[6] == 'H') 690278
grey[6] == 'Y') 956587
grey[6] == 'u') 957191


Now the train of thought is: we have `qu`, so the next letter has to be a vowel like `i,o,e,a`. However, looking at the index of `u` from the previous step (957191), we can cross out all options that come before that index, so we're left with options `e,i`. Because there are still hundreds of thousands of characters to go through, I select `i` simply because it has the largest index, allowing me to eliminate more options.

In [29]:
index_7 = [i for i in range(len(relevant)) if relevant.startswith('grey[7] == ', i)]
for i in index_7:
    print(relevant[i:i+15], i)
#select i

grey[7] == 'f') 78
grey[7] == 'C') 2506
grey[7] == 'R') 3325
grey[7] == 'a') 3737
grey[7] == 'e') 7263
grey[7] == 'W') 13106
grey[7] == 'M') 13592
grey[7] == 'E') 17874
grey[7] == 's') 29605
grey[7] == 'P') 34154
grey[7] == 'K') 48361
grey[7] == 'T') 49829
grey[7] == 'D') 50612
grey[7] == 'y') 51058
grey[7] == 's') 78531
grey[7] == 'u') 78948
grey[7] == 'V') 121925
grey[7] == 'J') 125556
grey[7] == 'P') 170224
grey[7] == 'v') 170625
grey[7] == 'd') 198012
grey[7] == 'G') 294955
grey[7] == 'm') 513275
grey[7] == 'K') 688568
grey[7] == 'L') 689367
grey[7] == 'p') 689810
grey[7] == 'p') 690299
grey[7] == 'H') 734082
grey[7] == 'a') 747946
grey[7] == 'j') 824589
grey[7] == 'k') 845586
grey[7] == 'j') 919665
grey[7] == 'X') 956608
grey[7] == 'V') 957212
grey[7] == 'e') 957631
grey[7] == 'w') 1240460
grey[7] == 'z') 1268712
grey[7] == 'i') 1276678
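
The elimination step is just a filter on offsets. A small sketch using a handful of entries copied from the `grey[7]` listing and the offset of the chosen `u`; `later_candidates` is a hypothetical helper, not something from the challenge.

```python
def later_candidates(found, min_offset):
    """Keep only (character, offset) pairs located after min_offset."""
    return [(ch, off) for ch, off in found if off > min_offset]

# A few grey[7] entries copied from the output
grey7 = [("u", 78948), ("V", 957212), ("e", 957631),
         ("w", 1240460), ("z", 1268712), ("i", 1276678)]
u_offset = 957191  # offset of the chosen grey[6] == 'u' check

print(later_candidates(grey7, u_offset))
```

Everything before `u_offset` drops out, which leaves only two vowels to choose between.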


Now let's shorten the search space with the indices we have

In [10]:
shorten1 = relevant[1276678:]
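
Slicing works, but every new slice resets the offsets to zero, which makes them harder to compare across steps. An alternative sketch that keeps offsets relative to the original string by passing a start position to `str.find`; `find_all` is a hypothetical helper.

```python
def find_all(text, needle, start=0):
    """Yield every offset >= start at which needle occurs in text."""
    pos = text.find(needle, start)
    while pos != -1:
        yield pos
        pos = text.find(needle, pos + 1)

# Toy stand-in for `relevant`
toy = "grey[8] == 'x' ... grey[7] == 'i' ... grey[8] == 'o' ... grey[8] == 't'"
cut = toy.index("grey[7] == 'i'")

# Only checks located after the chosen grey[7] check are reported
for pos in find_all(toy, "grey[8] == ", cut):
    print(toy[pos:pos + 14], pos)
```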

Now, there's only 1 logical option here, because I highly doubt the first word is `quibble` or any nonsense like that. So select `t`.

In [12]:
index_8 = [i for i in range(len(shorten1)) if shorten1.startswith('grey[8] == ', i)]
for i in index_8:
    print(shorten1[i:i+15], i)
#select t

grey[8] == 'o') 21
grey[8] == 'D') 3526
grey[8] == 'b') 6824
grey[8] == 't') 306358


Now there's only one logical option here given our "known" (lol) prefix of `quit`, so we guess `e` and hope we have a full word

In [14]:
index_9 = [i for i in range(len(shorten1)) if shorten1.startswith('grey[9] == ', i)]
for i in index_9:
    print(shorten1[i:i+15], i)
#select e

grey[9] == 'b') 42
grey[9] == 'i') 380
grey[9] == 'c') 3547
grey[9] == 'D') 5429
grey[9] == 'Q') 6483
grey[9] == 'p') 6845
grey[9] == 'B') 10033
grey[9] == 'g') 96201
grey[9] == 'Q') 271987
grey[9] == 'n') 300291
grey[9] == 'h') 301612
grey[9] == 'E') 306379
grey[9] == 'g') 307032
grey[9] == 'z') 552617
grey[9] == 'O') 571360
grey[9] == 'z') 575247
grey[9] == 'e') 647477


Okay good job, let's shorten our search space more with extreme confidence

In [16]:
shorten2 = shorten1[306379:]
len(shorten2)

380079

Okay not bad, 380k more characters to go.

Now it's clear one of these characters is not like the others. Could this be a word separator?
Select the `_` for confirmation bias of good decisions made so far (believe it or not, this was my actual train of thought).

In [17]:
index_10 = [i for i in range(len(shorten2)) if shorten2.startswith('grey[10] == ', i)]
for i in index_10:
    print(shorten2[i:i+15], i)
#select _

grey[10] == 'U' 21
grey[10] == 'J' 334
grey[10] == 'A' 674
grey[10] == 'd' 99970
grey[10] == 'd' 203249
grey[10] == 'B' 218666
grey[10] == 'P' 218963
grey[10] == 'm' 220124
grey[10] == 'e' 246259
grey[10] == 'M' 248918
grey[10] == 'd' 260153
grey[10] == 'd' 265002
grey[10] == 'g' 266180
grey[10] == 'Z' 267416
grey[10] == 'L' 268889
grey[10] == 'M' 269194
grey[10] == 'V' 286254
grey[10] == 'o' 287067
grey[10] == 'H' 289883
grey[10] == 'l' 340782
grey[10] == 'T' 341119
grey[10] == '_' 341426


Now we can cut down our search space even more, and wow, only 38k characters to go

In [19]:
shorten3 = shorten2[341426:]
len(shorten3)

38653

Wow that's pretty easy, so now our guess so far is `quite_b`

In [21]:
index_11 = [i for i in range(len(shorten3)) if shorten3.startswith('grey[11] == ', i)]
for i in index_11:
    print(shorten3[i:i+15], i)

grey[11] == 'b' 22


Hmm, pretty difficult decision here, it could be `o` or `i`. Trust my gut and take the option that narrows down the search space the most. Take `i`

In [22]:
index_12 = [i for i in range(len(shorten3)) if shorten3.startswith('grey[12] == ', i)]
for i in index_12:
    print(shorten3[i:i+15], i)

grey[12] == 'o' 44
grey[12] == 'g' 4669
grey[12] == 'k' 18146
grey[12] == 'w' 26317
grey[12] == 'i' 27655


There's been a clear trend of all lowercase characters, so I immediately drop the uppercase options from my headspace and only consider options that occur after the index of `i` (27655). That leaves 2 realistic options, `bil` or `big`, and I doubt the word is something like `bill` or `bilateral` or idk what else it could be. Pray hard that the next index is a `_` to confirm suspicions.

In [24]:
index_13 = [i for i in range(len(shorten3)) if shorten3.startswith('grey[13] == ', i)]
for i in index_13:
    print(shorten3[i:i+15], i)

grey[13] == 'd' 66
grey[13] == 'q' 559
grey[13] == 'v' 765
grey[13] == 'V' 1785
grey[13] == 'F' 4691
grey[13] == 'E' 6191
grey[13] == 'E' 6386
grey[13] == 'A' 14203
grey[13] == 'L' 15217
grey[13] == 'X' 15414
grey[13] == 'N' 18168
grey[13] == 'a' 19136
grey[13] == 'f' 21426
grey[13] == 'm' 21629
grey[13] == 'J' 25459
grey[13] == 'W' 26339
grey[13] == 'i' 26697
grey[13] == 'L' 27677
grey[13] == 'l' 27871
grey[13] == 'G' 28075
grey[13] == 'g' 31881


What do you know, we have a special character show up again, which seems to demarcate our words well. Also take note of the apparent trend where the selected character is always the last possible option, almost as if once you bypass that area, there's no more need to check that index anymore... hmmm

In [26]:
index_14 = [i for i in range(len(shorten3)) if shorten3.startswith('grey[14] == ', i)]
for i in index_14:
    print(shorten3[i:i+15], i)

grey[14] == 'v' 88
grey[14] == 'k' 248
grey[14] == 'd' 581
grey[14] == 'N' 787
grey[14] == 'n' 1179
grey[14] == 's' 1807
grey[14] == 'b' 3019
grey[14] == 'l' 4128
grey[14] == 'Y' 4713
grey[14] == 'S' 4861
grey[14] == 'I' 5014
grey[14] == 'B' 5448
grey[14] == 'g' 6213
grey[14] == 'O' 6408
grey[14] == 'j' 11154
grey[14] == 'a' 11307
grey[14] == 'I' 13035
grey[14] == 'i' 13384
grey[14] == 'L' 14225
grey[14] == 'C' 14723
grey[14] == 's' 14893
grey[14] == 'B' 15041
grey[14] == 'K' 15239
grey[14] == 'd' 15436
grey[14] == 'Q' 16366
grey[14] == 'G' 16644
grey[14] == 't' 16814
grey[14] == 'p' 18190
grey[14] == 'R' 18652
grey[14] == 'T' 19158
grey[14] == 's' 19691
grey[14] == 'q' 19841
grey[14] == 'Z' 21448
grey[14] == 'c' 21651
grey[14] == 'o' 22328
grey[14] == 'X' 23222
grey[14] == 'i' 24546
grey[14] == 'c' 25481
grey[14] == 'f' 25640
grey[14] == 'H' 26123
grey[14] == 'G' 26361
grey[14] == 'f' 26522
grey[14] == 't' 26719
grey[14] == 'v' 27254
grey[14] == 'U' 27699
grey[14] == 'F' 27893
grey[14

Shorten the search space. We only have 2 flag characters left to find, and about 2000 characters of source left to search, so this is totally doable.

In [29]:
shorten4 = shorten3[36514:]
len(shorten4)

2139

Okay this looks alright, let's see the last character.
(Hold the thought of the last character theory...)

In [31]:
index_15 = [i for i in range(len(shorten4)) if shorten4.startswith('grey[15] == ', i)]
for i in index_15:
    print(shorten4[i:i+15], i)

grey[15] == 'w' 22
grey[15] == 'N' 1232
grey[15] == 'a' 1529


Well, is this it? We only have one character left at the back, which is lowercase as well. The hypothesis of the last possible character being the correct one has been proven.

In [33]:
index_16 = [i for i in range(len(shorten4)) if shorten4.startswith('grey[16] == ', i)]
for i in index_16:
    print(shorten4[i:i+15], i)

grey[16] == 'M' 44
grey[16] == 'Q' 131
grey[16] == 'O' 269
grey[16] == 'N' 532
grey[16] == 'K' 1027
grey[16] == 'L' 1254
grey[16] == 'D' 1551
grey[16] == 'h' 1771


Okay, let's take stock of all we have
`grey{quite_big_ah}`. Seems right! That was pretty easy
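
With hindsight, the "last option wins" observation can be turned into a few lines of code. This is a sketch of that heuristic only, not a real solver: it held for every position whose full listing is shown in this writeup, but nothing guarantees it for other positions or other files. `guess_flag` is a hypothetical helper name.

```python
import re

def guess_flag(source, length, name="grey"):
    """Build a flag guess from the last comparison found for each position."""
    chars = []
    for n in range(length):
        # Assumes checks look like name[n] == 'c' with a single quoted character
        pattern = re.escape(f"{name}[{n}] == ") + r"'(.)'"
        matches = re.findall(pattern, source)
        chars.append(matches[-1] if matches else "?")  # "?" marks a position with no check
    return "".join(chars)

# Toy stand-in: two decoy checks followed by the "real" ones
toy = "grey[0] == 'x' ... grey[1] == 'z' ... grey[0] == 'a' ... grey[1] == 'b'"
print(guess_flag(toy, 2))
```

Any guess produced this way still needs to be run through the actual checker.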

#### Closing Notes
Haiz I wish it was so simple.... I made a fatal mistake during the CTF.  
Instead of doing `shorten4 = shorten3[36514:]`, I did `shorten4 = shorten3[:36514]` which gave me a whole bogus set of characters that I subsequently used to brute force the flag (using the extremely slow flag checker). It had a whole bunch of characters except the ones I needed
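
A cheap guard against this kind of slip is to never write the slice direction by hand and to verify that the chosen check really sits at the cut point. A minimal sketch on toy data; `cut_after` is a hypothetical helper.

```python
def cut_after(text, offset, needle):
    """Return text[offset:], first verifying that needle starts at offset."""
    if not text.startswith(needle, offset):
        raise ValueError(f"{needle!r} not found at offset {offset}")
    return text[offset:]

# Toy stand-in for `shorten3`
toy = "grey[13] == 'g' ... grey[14] == '_' ... grey[15] == 'a'"
n = toy.index("grey[14] == '_'")

tail = cut_after(toy, n, "grey[14] == '_'")  # what was intended: text[n:]
head = toy[:n]                               # the mistaken direction

print("grey[15]" in tail, "grey[15]" in head)
```

The later checks only exist in the tail, so searching the head can never turn up the remaining characters.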

I even went to the extent of doubting my own admittedly guessy first steps, and started considering uppercase characters. I even went to the CSW Scrabble Word List for lists of valid 2-letter and 5-letter words, then used `itertools.product` over the possible letters, only keeping a candidate if its `word.upper()` was in the list. I left that running overnight, but my aforementioned critical mistake meant the letters I actually needed were never in the candidate set....
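
For reference, the word-list filter looks roughly like this. It is a sketch, not the script used during the CTF: the two-word set stands in for the real word list, the per-position letters are copied from the `grey[15]` and `grey[16]` listings purely for illustration, and `word_candidates` is a hypothetical name.

```python
from itertools import product

def word_candidates(options_per_position, valid_words):
    """Yield each combination of per-position letters whose uppercase form is a valid word."""
    for combo in product(*options_per_position):
        word = "".join(combo)
        if word.upper() in valid_words:
            yield word

valid = {"AH", "NO"}              # tiny stand-in for a real 2-letter word list
options = ["wNa", "MQONKLDh"]     # candidate letters for two consecutive positions

print(list(word_candidates(options, valid)))
```

The filter can only return words built from the letters it is given, so a wrong candidate set silently produces wrong words.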

My guess in the end was `grey{quite_big_no}` which was close but not close enough. Quite a fun challenge to attempt and now I will learn better flag guessing skills for the next CTF