01_scrape: Fix: Do not rely on existence of parent id #1

Closed
wants to merge 1 commit into from

intrigus-lgtm
Contributor

Problem

ctfd2pages works fine on https://ctf2022.maplebacon.org/challenges because the parent of .challenge-button has a unique id.

<div id="1093826663" class="col-md-3 d-inline-block">
  <button class="[...] challenge-button [...]" value="7">
    [...]
  </button>
</div>

I tried to archive my team's CTF (https://ctf.kitctf.me/challenges) and it failed because our HTML has this structure instead, with no id on the parent:

<div class="col-sm-6 col-md-4 col-lg-3">
  <button class="challenge-button [...]" value="9">
    [...]
  </button>
</div>

I know that CTFd should in theory always add this parent id (https://github.com/CTFd/CTFd/blob/3c299095cb501a30a05e1e7b96c582f4724cedf2/CTFd/themes/core/assets/js/pages/challenges.js#L320), but I think our custom theme interfered with this in some way.

"Solution"

Match on the value attribute of .challenge-button instead. It holds the challenge id and is (in theory) also always added (https://github.com/CTFd/CTFd/blob/3c299095cb501a30a05e1e7b96c582f4724cedf2/CTFd/themes/core/assets/js/pages/challenges.js#L325-L334).
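A minimal sketch of the lookup this change proposes, assuming the button carries the challenge id in its `value` attribute as in both HTML samples above. The helper name `challengeButtonSelector` is hypothetical and not part of ctfd2pages; it only builds the selector string that would then be handed to `document.querySelector`.

```javascript
// Hypothetical helper: build a selector that finds a challenge button by its
// `value` attribute instead of by the id of its parent element.
function challengeButtonSelector(challengeId) {
  // Escape backslashes and double quotes so the id is safe inside "...".
  const escaped = String(challengeId)
      .replace(/\\/g, '\\\\')
      .replace(/"/g, '\\"');
  return `.challenge-button[value="${escaped}"]`;
}

// The two buttons shown above carry value="7" and value="9".
console.log(challengeButtonSelector(7)); // .challenge-button[value="7"]
console.log(challengeButtonSelector(9)); // .challenge-button[value="9"]
```

This still depends on the theme emitting `value`, so it trades one assumption about the markup for another.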

Feel free to close/ignore this PR if this is not a change you want to make.

@reteps
Member

reteps commented Dec 11, 2022

Perhaps a more long-term solution (and robust against CTFd theme changes) would be to use the CTFd API instead of puppeteer. I think this change is good though, perhaps using this as a fallback would be good.

@zhuyifei1999
Member

> Perhaps a more long-term solution (and robust against CTFd theme changes) would be to use the CTFd API instead of puppeteer. I think this change is good though, perhaps using this as a fallback would be good.

I don't think CTFd API gives you the theming and all the templates being rendered, right? I mean, there are other CTFd archival tools out there, but they just archive the challenge with descriptions and handouts, rather than have a website so people can see what the CTF looked like while it was ongoing. And I think if we are to rely on the API alone, that's what we'll end up getting, just JSONs of data instead of renders.

Though, here, if a web page accesses an API end point, that API result is archived too, such as: https://github.com/sigpwny/2022.uiuc.tf/tree/main/api/v1

@zhuyifei1999
Member

I checked whether 'value' exists on the challenges of UIUCTF. It exists for UIUCTF 2020 and 2021, but not for 2022, where the HTML looks like:

<div id="-1090740043" class="col-lg-4 col-md-6 d-inline-block black-magic tag-rev tag-windows">
  <div class="challenge-button mb-5 mx-5" id="198">
    <div style="background: url(&quot;/themes/core/static/img/challenge-art/rejecttoinject.png&quot;); background-size: cover;" class="challenge-spotlight-wrapper aspect-ratio-1-1">
[...]
    </div>
    <div class="btn btn-dark w-100 text-truncate pt-3 pb-3 mb-2 mt-2 position-relative">
[...]
    </div>
  </div>
</div>

I think we need something that holds a strong reference to the elements themselves, rather than looking them up by value or ID.
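To make the mismatch concrete, the sketch below models only what the three HTML snippets quoted in this thread show: whether the wrapper has an `id` and whether the button has a `value`. The object layout and the `worksEverywhere` helper are illustrative assumptions, not ctfd2pages code.

```javascript
// What each quoted snippet exposes, taken from the HTML shown in this thread.
const themes = {
  'ctf2022.maplebacon.org': {parentId: true, buttonValue: true},
  'ctf.kitctf.me': {parentId: false, buttonValue: true},
  'UIUCTF 2022': {parentId: true, buttonValue: false},
};

// A lookup strategy is only safe if every theme provides the attribute.
function worksEverywhere(attribute) {
  return Object.values(themes).every((theme) => theme[attribute]);
}

console.log(worksEverywhere('parentId')); // false: missing on ctf.kitctf.me
console.log(worksEverywhere('buttonValue')); // false: missing on UIUCTF 2022
```

Neither attribute is present on all three sites, which is the argument for keeping a reference to the element itself.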

@reteps
Member

reteps commented Dec 12, 2022

> > Perhaps a more long-term solution (and robust against CTFd theme changes) would be to use the CTFd API instead of puppeteer. I think this change is good though, perhaps using this as a fallback would be good.
>
> I don't think CTFd API gives you the theming and all the templates being rendered, right? I mean, there are other CTFd archival tools out there, but they just archive the challenge with descriptions and handouts, rather than have a website so people can see what the CTF looked like while it was ongoing. And I think if we are to rely on the API alone, that's what we'll end up getting, just JSONs of data instead of renders.
>
> Though, here, if a web page accesses an API end point, that API result is archived too, such as: https://github.com/sigpwny/2022.uiuc.tf/tree/main/api/v1

Ah sorry, I misread this stage and thought it was mainly for downloading handout files.

@zhuyifei1999
Member

zhuyifei1999 commented Dec 12, 2022

Apparently puppeteer has this Page.$ API that I never realized existed:

diff --git a/01_scrape/index.js b/01_scrape/index.js
index d126a99..10a315b 100644
--- a/01_scrape/index.js
+++ b/01_scrape/index.js
@@ -209,30 +209,20 @@ class PageHandler {
   async handleSpecials(page) {
     if (this.pageUrl === `${this.parent.origin}`) {
       // Puppeteer headless don't fetch favicon
-      const favicon = await page.evaluate(() => {
-        return document.querySelector('link[rel*=\'icon\']').href;
-      });
+      const favicon = await page.$eval('link[rel*=\'icon\']',
+          (e) => e.href);
       this.parent.allHandouts.add(favicon);
     } else if (this.pageUrl === `${this.parent.origin}challenges`) {
-      const chals = await page.evaluate(() => {
-        return Array.from(document.querySelectorAll(
-            '.challenge-button')).map((e) => e.parentElement.id);
-      });
+      const chals = await page.$$('.challenge-button');
 
       for (const chal of chals) {
         // Challenge Tab
         this.browseCompleted = new HeartBeat();
-        await page.evaluate((chal) => {
-          document.getElementById(chal)
-              .querySelector('.challenge-button')
-              .click();
-        }, chal);
+        await chal.click();
         await this.browseCompleted.wait();
 
-        const handouts = await page.evaluate(() => {
-          return Array.from(document.querySelectorAll(
-              '.challenge-files a')).map((e) => e.href);
-        });
+        const handouts = await page.$$eval('.challenge-files a',
+            (l) => l.map((e) => e.href));
         for (const handout of handouts) {
           assert(handout.startsWith(this.parent.origin));
           this.parent.allHandouts.add(handout);
@@ -240,12 +230,12 @@ class PageHandler {
 
         // Solves Tab
         this.browseCompleted = new HeartBeat();
-        await page.evaluate(() => {
-          document.querySelector('.challenge-solves').click();
-        }, chal);
+        await (await page.$('.challenge-solves')).click();
         await this.browseCompleted.wait();
 
+        await sleep(500);
         await page.keyboard.press('Escape');
+        await sleep(500);
       }
     }
   }
@@ -261,9 +251,8 @@ class PageHandler {
 
     await this.browseCompleted.wait();
 
-    const links = await page.evaluate(() => {
-      return Array.from(document.querySelectorAll('a')).map((e) => e.href);
-    });
+    const links = await page.$$eval('a',
+        (l) => l.map((e) => e.href));
 
     for (let link of links) {
       if (!link.startsWith(this.parent.origin)) {

What do you think? I also had to add some delay around the Escape key press, otherwise it hangs for some reason. Maybe an animation?
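The core of the patch is to keep the handles returned by `page.$$` and click them directly, instead of re-finding each element by id. A rough sketch of that loop follows. The `sleep` helper is not defined in the hunks shown, so the version here is an assumption about its shape; `clickEach` and `fakePage` are hypothetical names, and `fakePage` is a stand-in so the sketch runs without a browser.

```javascript
// Hypothetical helper: resolve after roughly `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Click every element matched by `selector`, using the handles returned by
// page.$$ rather than looking each element up again by id or value.
async function clickEach(page, selector, pauseMs) {
  const handles = await page.$$(selector);
  for (const handle of handles) {
    await handle.click();
    // Pause on both sides of the key press, as the patch does.
    await sleep(pauseMs);
    await page.keyboard.press('Escape');
    await sleep(pauseMs);
  }
  return handles.length;
}

// Stand-in for a Puppeteer page; it only records what was called.
const log = [];
const fakePage = {
  $$: async () => ['a', 'b'].map((name) => ({
    click: async () => log.push(`click ${name}`),
  })),
  keyboard: {press: async (key) => log.push(`press ${key}`)},
};

clickEach(fakePage, '.challenge-button', 1).then((count) => {
  console.log(count, log);
});
```

Fixed delays are a blunt tool. If the hang really is animation-related, waiting for a specific condition on the page might be more reliable, but that is not verified here.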

@intrigus-lgtm
Contributor Author

I now understand why it failed for our theme, we are using ctfd-core-beta and it has a different structure:
https://github.com/CTFd/core-beta/blob/main/templates/challenges.html#L45-L50

@zhuyifei1999
Member

> using ctfd-core-beta

Ah hmm... The hooks in 08_hook_challenge (https://github.com/sigpwny/ctfd2pages/blob/main/08_hook_challenge/webpack-index.js) might not work for you either. If you want, you can check which functions to hook instead in the beta, or I can do it later.

In any case, does that patch above look good to you? If so I'll commit it.

@intrigus-lgtm
Contributor Author

Your patch works very well.

And you are right, the hooks are not working.
I fixed it by hooking CTFd.fetch.
I don't think hooking CTFd.pages.challenge.submitChallenge (https://github.com/CTFd/CTFd.js/blob/main/pages/challenge.js#L36-L57)
would have been a better choice.
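The wrapping pattern behind this fix can be sketched without CTFd itself. Below, `client` is a stand-in object and `hookFetch` a hypothetical helper. The real `CTFd.fetch` deals in `Response` objects, which the sketch replaces with plain strings to stay runnable anywhere. It also tolerates a missing `options` argument, which is an added precaution rather than something discussed in this thread.

```javascript
// Hypothetical helper: replace target.fetch with a wrapper that answers
// challenge-attempt POSTs locally and forwards everything else.
function hookFetch(target, handleAttempt) {
  const originalFetch = target.fetch;
  target.fetch = function(url, options = {}) {
    const isAttempt = /\/api\/v1\/challenges\/attempt/.test(url) &&
        options.method === 'POST' && options.body !== undefined;
    if (isAttempt) {
      return handleAttempt(JSON.parse(options.body));
    }
    return originalFetch.call(target, url, options);
  };
}

// Stand-in for the CTFd object; not the real API surface.
const client = {fetch: (url) => `network:${url}`};
hookFetch(client, (body) => `local:${body.challenge_id}`);

console.log(client.fetch('/api/v1/challenges/attempt',
    {method: 'POST', body: JSON.stringify({challenge_id: 9})})); // local:9
console.log(client.fetch('/api/v1/challenges')); // network:/api/v1/challenges
```

Hooking at this level catches the request regardless of which page function issued it, at the cost of matching on a URL pattern.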

Notes

  1. I'm not really a JS developer so some stuff can likely be improved.
  2. I believe this does not handle hints at all (https://github.com/CTFd/CTFd.js/blob/main/pages/challenge.js#L60-L66), but our CTF did not have any hints.
diff --git a/08_hook_challenge/webpack-index.js b/08_hook_challenge/webpack-index.js
index f493755..edb20db 100644
--- a/08_hook_challenge/webpack-index.js
+++ b/08_hook_challenge/webpack-index.js
@@ -1,7 +1,6 @@
 /* global CTFd */
 
 const FLAGS = require('./flags.json');
-const md = CTFd.lib.markdown();
 
 const sha256sum = async (string) => {
   const utf8 = new TextEncoder().encode(string);
@@ -11,7 +10,12 @@ const sha256sum = async (string) => {
       .join('');
 };
 
-CTFd.api.post_challenge_attempt = async function(parameters, body) {
+if (CTFd.api !== undefined) {
+  CTFd.api.post_challenge_attempt = post_challenge_attempt_hooked;
+  CTFd.api.get_hint = get_hint_hooked;
+}
+
+const post_challenge_attempt_hooked = async function(parameters, body) {
   const {challenge_id: chalId, submission: flag} = body;
   const expectedSHA = FLAGS[chalId];
   const submittedSHA = await sha256sum(flag);
@@ -46,7 +50,8 @@ CTFd.api.post_challenge_attempt = async function(parameters, body) {
   }
 };
 
-CTFd.api.get_hint = async function(parameters) {
+const get_hint_hooked = async function(parameters) {
+  const md = CTFd.lib.markdown();
   const hintId = parameters.hintId;
 
   for (const hintOrig of CTFd._internal.challenge.data.hints) {
@@ -68,3 +73,15 @@ CTFd.api.get_hint = async function(parameters) {
 
   throw new Error('Hint not found');
 };
+
+if (CTFd.fetch !== undefined) {
+  const originalCTFdFetch = CTFd.fetch;
+  CTFd.fetch = async function(url, options) {
+    if (/\/api\/v1\/challenges\/attempt/.test(url) && options.method === 'POST' && options.body !== undefined) {
+      const jsonResponse = await post_challenge_attempt_hooked({}, JSON.parse(options.body));
+      return new Response(JSON.stringify(jsonResponse));
+    } else {
+      return await originalCTFdFetch(url, options);
+    }
+  };
+}
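
One detail of this diff may be worth double-checking: the `if (CTFd.api !== undefined)` block reads `post_challenge_attempt_hooked` and `get_hint_hooked` before the `const` declarations that define them. In plain JavaScript that throws a `ReferenceError` whenever the branch runs (or, if a build step transpiles `const` to `var`, assigns `undefined`). On themes where `CTFd.api` exists, the assignment would presumably need to move below the definitions, or the hooks would need to be function declarations. A standalone illustration with made-up names:

```javascript
// Reading a `const` before its declaration hits the temporal dead zone.
function assignTooEarly() {
  const api = {};
  try {
    api.hook = hookedLate; // declared with `const` further down
  } catch (err) {
    return err.name;
  }
  const hookedLate = () => 'hooked';
  return typeof api.hook;
}

// Function declarations are hoisted, so the same order works.
function assignHoisted() {
  const api = {};
  api.hook = hookedEarly;
  function hookedEarly() {
    return 'hooked';
  }
  return api.hook();
}

console.log(assignTooEarly()); // ReferenceError
console.log(assignHoisted()); // hooked
```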

@zhuyifei1999
Member

> Your patch works very well.

Thanks for testing! I'll merge it in a sec.

> I don't think hooking CTFd.pages.challenge.submitChallenge (https://github.com/CTFd/CTFd.js/blob/main/pages/challenge.js#L36-L57) would have been a better choice.

Could you clarify why you think so? I see the block

  if (CTFd._functions.challenge.submitChallenge) {
    CTFd._functions.challenge.submitChallenge(challengeId, challengeValue);
    return;
  }

and it seems to me almost as if it's intended to be hooked there.
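
The quoted block suggests an override point: when a function is present at `_functions.challenge.submitChallenge`, it is called instead of the default submission path. The sketch below imitates that check on a stand-in object. `fakeCTFd`, `submit`, and the `calls` log are assumptions for illustration, and the real default path in CTFd.js is not reproduced here.

```javascript
// Stand-in object mirroring only the shape used in the quoted check.
const fakeCTFd = {_functions: {challenge: {submitChallenge: null}}};
const calls = [];

// Imitates the quoted guard: use the override when one is installed.
function submit(challengeId, challengeValue) {
  if (fakeCTFd._functions.challenge.submitChallenge) {
    fakeCTFd._functions.challenge.submitChallenge(challengeId, challengeValue);
    return;
  }
  calls.push('default'); // placeholder for the built-in submission path
}

submit(9, 'flag{example}'); // no override yet, so the default path runs

// Installing a hook, as an archived site might do to check flags offline.
fakeCTFd._functions.challenge.submitChallenge = (id, value) => {
  calls.push(`hooked ${id}`);
};
submit(9, 'flag{example}');

console.log(calls); // [ 'default', 'hooked 9' ]
```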

@zhuyifei1999
Member

Actually let me create an issue for this so I can close this PR.

@zhuyifei1999
Member

zhuyifei1999 commented Dec 14, 2022

Fixed, thanks for the report and the patch! I'd still like to hear your thoughts on the hooking, if you're willing. Let's follow up in #2.
