Determine how we can verify a page was migrated successfully #36

Closed
2 tasks done
zstix opened this issue Aug 26, 2020 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@zstix
Contributor

zstix commented Aug 26, 2020

Description

The generalized process for migrating content from the existing site to the repository looks like this:

  1. Fetch raw HTML
  2. Sanitize and normalize the HTML
  3. Create a new MDX file
  4. Add frontmatter details for the page
  5. Add the HTML content from step 2
  6. Verify the page is working correctly

In order to complete the process, we need to verify the page is working correctly. When we run npm run build on our machines (or the code builds on Amplify), Gatsby will return an error if a page is not working correctly. We can probably use this to identify whether a page is failing to build. That said, we need a way to identify all the pages that have issues, not just the first one that causes the build to quit.

If only one page is broken, we fix the page. If hundreds of pages are broken, we need to fix the migration code. Knowing how many pages are broken (and which ones) is critical.

Related Issue(s)

After reviewing the remaining research tasks, we have decided to break the work in #12 down into multiple smaller tickets. This is one of those tickets.

Success Criteria

  • Determine how we verify if a page is broken (is npm run build output sufficient?)
  • Identify how we can get a list of pages that are not working post-migration
@zstix
Contributor Author

zstix commented Aug 27, 2020

We have confirmed that Gatsby only throws an error for the first failing page and will not list all of the pages that are not working. We are going to look into our options for getting a richer set of information.

@zstix
Contributor Author

zstix commented Sep 1, 2020

Unfortunately, there does not seem to be a way to easily determine all the pages that are not building successfully. Here's a recap of things we have tried:

npm run build
This will fail on the first occurrence of an issue and does not provide insight into what other pages are having issues.

Ping pages locally for HTTP status
I created a script that would ping all the pages found on the sitemap. The thinking was that a failing page might return a 400-level status code that we could use to detect the failure. Unfortunately, the issue is purely client side and happens after the original page load, so the status code does not reveal it (and we couldn't scrape the page content either).
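
For reference, a status check of this kind might look roughly like the sketch below. It is not the original script, which is not included in this thread. The function names are made up for illustration, and it assumes the sitemap follows the standard sitemaps.org format. As noted above, a page that breaks on the client after load would still return a success status here.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(sitemap_xml):
    """Return every <loc> URL listed in a sitemap document (given as a string)."""
    root = ET.fromstring(sitemap_xml)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


def http_status(url):
    """Fetch a URL and return its HTTP status code, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for error statuses; the code is still what we want.
        return err.code


def failing_pages(urls, get_status=http_status):
    """Map each URL whose status is 400 or higher to that status code."""
    statuses = {url: get_status(url) for url in urls}
    return {url: code for url, code in statuses.items() if code >= 400}
```

The get_status parameter is only there so the check can be exercised without a running server. In normal use it would be called as failing_pages(urls_from_sitemap(xml_text)).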

Compare built files to expected files
When we run npm run build, static HTML files get generated in the public folder. We thought about comparing this to the expected output, but this also does not work due to Gatsby's fail-on-first-bad-file mechanism.
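
A comparison like this could be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes each page path is written to public/<path>/index.html. Because the build stops at the first bad file, the public folder would be incomplete and most pages would show up as missing, which is why this approach did not help.

```python
from pathlib import Path


def expected_html_file(public_dir, page_path):
    """Return where a built page is assumed to live.

    Assumption: a page at /docs/foo/ is written to public/docs/foo/index.html.
    """
    return Path(public_dir) / page_path.strip("/") / "index.html"


def missing_pages(public_dir, page_paths):
    """List the page paths that have no HTML file in the build output."""
    return [
        path
        for path in page_paths
        if not expected_html_file(public_dir, path).is_file()
    ]
```

Here page_paths would be the list of paths the migration was expected to produce, for example the paths of the generated MDX files.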


My suggestion would be to catch errors before we get to Gatsby. During the sanitize and normalize step, we should make sure to capture any errors. When it comes time to build the site post-migration, we should keep an eye out for patterns. If we find that we have to fix the same error multiple times, we should update the migration code and run it again.
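
One way to do this, as a minimal sketch: run the sanitize step over every page, record failures instead of stopping at the first one, and then count how many pages share each error. The sanitize callable and both function names below are placeholders, not the actual migration code.

```python
from collections import Counter


def sanitize_all(pages, sanitize):
    """Run `sanitize` on every page, recording failures instead of stopping.

    `pages` maps a page path to its raw HTML. `sanitize` is a hypothetical
    callable standing in for the real sanitize-and-normalize step.
    """
    cleaned, errors = {}, {}
    for path, raw_html in pages.items():
        try:
            cleaned[path] = sanitize(raw_html)
        except Exception as exc:  # keep going so every failure is reported
            errors[path] = f"{type(exc).__name__}: {exc}"
    return cleaned, errors


def error_patterns(errors):
    """Count how many pages share each error message, most common first."""
    return Counter(errors.values()).most_common()
```

The errors dictionary answers "which pages are broken", and error_patterns answers "is this one bad page or a bug in the migration code". A message shared by hundreds of pages points at the migration code.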

Lastly, we may have the option to use the raw HTML with Gatsby instead of pasting it into the MDX file. This has its own set of challenges, but is an option.

I'm going to close this ticket as I believe we have identified all that we can given the current setup.

zstix closed this as completed Sep 1, 2020
mfulb pushed a commit to mfulb/docs-website that referenced this issue Sep 21, 2022