directory vs .md file #101

Closed
q5sys opened this issue Jan 29, 2021 · 11 comments

@q5sys

q5sys commented Jan 29, 2021

I'm not 100% sure if this is something I'm screwing up or not, so if that's the case let me know what I'm doing wrong so I can fix it.

I'm using Hugo, so all .md files get turned into directories when built.
So if I have /content/article/newthing.md in my repo, that will get turned into website.com/content/article/newthing/ when built and published online.

So if I want to link to that page from another page, I would use /content/article/newthing/ as the link, because /content/article/newthing.md will not exist on the website when published.

But this leads to practically every link being flagged as bad, because /content/article/newthing/ doesn't exist in the repo; /content/article/newthing.md does, and the link checker doesn't know they're one and the same.

Is there something in the config I can set to fix this... or is this just not something the link checker can do?

@gaurav-nelson
Owner

Thanks @q5sys for creating this issue. You can (maybe) handle this by using a custom config with the BASEURL option. In your custom configuration file use the following:

{
  "projectBaseUrl": "website.com",
  "replacementPatterns": [
    {
      "pattern": "^/",
      "replacement": "{{BASEURL}}/"
    }
  ]
}

or

{
  "projectBaseUrl": "website.com",
  "replacementPatterns": [
    {
      "pattern": "/$",
      "replacement": "/index.html"
    },
    {
      "pattern": "^/",
      "replacement": "{{BASEURL}}/"
    }
  ]
}

Try it and let me know if that works. Is your repository publicly available? Share a link here if it is.

@q5sys
Author

q5sys commented Feb 1, 2021

Thanks for the reply, I'll try that when I get time. I'm a bit confused how using {{baseurl}} will make a difference in the .md vs / issue, but I'm willing to give anything a try if it helps resolve the issue.

The repo is public; it's the TrueNAS Documentation repo. An example of a run throwing hundreds of errors is here:
https://github.com/freenas/documentation/runs/1735428478?check_suite_focus=true
And the site gets built and published here: https://www.truenas.com/docs/hub/

@gaurav-nelson
Owner

gaurav-nelson commented Feb 1, 2021

@q5sys

I'm a bit confused how using {{baseurl}} will make a difference in the .md vs / issue,...

Here is an example to make it more clear.

{
  "projectBaseUrl": "https://www.truenas.com/docs",
  "replacementPatterns": [
    {
      "pattern": "^/",
      "replacement": "{{BASEURL}}/"
    }
  ]
}
  • The link that fails in the content/en/hub/sharing/smb/smb-share.md document is /hub/sharing/smb/smb1/ (based on the failed check output).
  • When you use the BASEURL option, the link that markdown-link-check checks becomes https://www.truenas.com/docs/hub/sharing/smb/smb1/, which exists, so the test will pass.
  • If that fails (I don't think it will), you can use the other option with /index.html so the link to check becomes https://www.truenas.com/docs/hub/sharing/smb/smb1/index.html, which is essentially the same, and the test should pass.
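
To make the substitution in the bullets above concrete, here is a rough Python simulation of how a replacement pattern with {{BASEURL}} might rewrite a link before it is checked. The apply_replacements helper is hypothetical and only approximates the behaviour described in this thread (each pattern applied in order, first match only); it is not markdown-link-check's own code.

```python
import re


def apply_replacements(link, config):
    """Approximate the rewrite: apply each pattern in order, first match only."""
    base = config.get("projectBaseUrl", "")
    for rule in config.get("replacementPatterns", []):
        # Assumption: {{BASEURL}} is swapped for projectBaseUrl before replacing.
        replacement = rule["replacement"].replace("{{BASEURL}}", base)
        # A callable replacement avoids Python treating backslashes specially.
        link = re.sub(rule["pattern"], lambda m: replacement, link, count=1)
    return link


config = {
    "projectBaseUrl": "https://www.truenas.com/docs",
    "replacementPatterns": [{"pattern": "^/", "replacement": "{{BASEURL}}/"}],
}

checked = apply_replacements("/hub/sharing/smb/smb1/", config)
print(checked)  # https://www.truenas.com/docs/hub/sharing/smb/smb1/
```

Links that do not start with / (for example full https:// URLs) are left untouched by this particular pattern.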

@gaurav-nelson
Owner

The example in my previous comment will only work for existing published pages. You can also try the following:

{
  "projectBaseUrl": "content/en",
  "replacementPatterns": [
    {
      "pattern": "\/$",
      "replacement": ".md"
    },
    {
      "pattern": "^/",
      "replacement": "{{BASEURL}}/"
    }
  ]
}

I am not sure if it will work. However, the logic should replace the trailing / with .md and then prepend the full path to the actual .md file.
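
As a sanity check of that logic, the two rules can be traced in a small Python sketch. The apply_replacements helper is hypothetical and assumes the patterns run in the listed order, each replacing only the first match; the real tool may differ in detail.

```python
import re


def apply_replacements(link, config):
    """Hypothetical stand-in for the link rewrite step."""
    base = config.get("projectBaseUrl", "")
    for rule in config.get("replacementPatterns", []):
        replacement = rule["replacement"].replace("{{BASEURL}}", base)
        link = re.sub(rule["pattern"], lambda m: replacement, link, count=1)
    return link


config = {
    "projectBaseUrl": "content/en",
    "replacementPatterns": [
        # The JSON string "\/$" decodes to the regex /$ (a trailing slash).
        {"pattern": "/$", "replacement": ".md"},
        {"pattern": "^/", "replacement": "{{BASEURL}}/"},
    ],
}

# Step 1: /hub/sharing/smb/smb1/  ->  /hub/sharing/smb/smb1.md
# Step 2: /hub/sharing/smb/smb1.md  ->  content/en/hub/sharing/smb/smb1.md
local_path = apply_replacements("/hub/sharing/smb/smb1/", config)
print(local_path)  # content/en/hub/sharing/smb/smb1.md
```

Whether the resulting path resolves still depends on a file with exactly that name and case existing in the repository.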

@q5sys
Author

q5sys commented Feb 2, 2021

Ok, thanks for the additional information. I should be able to get time today to work on it.

@q5sys
Author

q5sys commented Feb 2, 2021

I tried out the substitution you suggested and while it did work, it ended up having the unintended consequence of breaking every link off site that we link to. I tried various things to see what else I could work out, but I think this is going to take a much more complex regex, so I'll keep working on it.
Thanks for your suggestions.
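
One plausible explanation for the off-site breakage, sketched in Python: the trailing-slash pattern is not limited to internal links, so any URL ending in / gets .md appended. The example.com URL is a made-up placeholder, and the snippet only imitates the first replacement rule.

```python
import re

links = [
    "/hub/sharing/smb/smb1/",       # internal site link from the thread
    "https://example.com/guide/",   # hypothetical off-site link
]

# Apply only the first rule from the suggested config: trailing "/" -> ".md".
results = {link: re.sub("/$", ".md", link, count=1) for link in links}

for original, rewritten in results.items():
    print(original, "->", rewritten)
# /hub/sharing/smb/smb1/ -> /hub/sharing/smb/smb1.md
# https://example.com/guide/ -> https://example.com/guide.md
```

The second result is a URL that almost certainly does not exist, which would match the symptom described above.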

@q5sys
Author

q5sys commented Feb 2, 2021

Looking through the docs you have, I had an additional question. You have replacementPatterns as something that can be done with the config file. Do you have something like an ignorePatterns, so that I could regex other, more complicated link issues that I'm still running into? I know I can ignore links in the markdown files themselves with your HTML code, but I was thinking of perhaps having two actions: one to check all of the "/ vs .md" links and another to check everything else. I'd just need a way to set the config.json file on each action to ignore the links that fall into the category checked by the other link check action.

@gaurav-nelson
Owner

@q5sys Yes.

  • You can use ignorePatterns; see https://github.com/tcort/markdown-link-check#config-file-format
  • The example I listed was a sample; you can refine the regex even more to only replace specific links. For example, to only replace links that begin with /hub/ and end with a /, use the regex (?<=^\/hub.+)\/$.
  • For links containing #, you can also add another replacement:
      {
        "pattern": "\/#(?=.+)",
        "replacement": ".md#"
      },
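
The narrowed rules can be tried out in Python with one caveat: Python's re module does not accept the variable-length lookbehind used in the regex above, so this sketch swaps in a capture group that selects the same links. The to_markdown_link function and the #some-heading anchor are hypothetical.

```python
import re


def to_markdown_link(link):
    """Rewrite /hub/.../ links to .md; leave everything else alone."""
    # Capture-group stand-in for (?<=^\/hub.+)\/$ : only links that start
    # with /hub and end with a slash lose the slash and gain ".md".
    link = re.sub(r"^(/hub.+)/$", r"\1.md", link, count=1)
    # Anchor rule from the comment above: "/#" followed by text -> ".md#".
    link = re.sub(r"/#(?=.+)", ".md#", link, count=1)
    return link


print(to_markdown_link("/hub/sharing/smb/smb1/"))
# /hub/sharing/smb/smb1.md
print(to_markdown_link("https://example.com/guide/"))
# https://example.com/guide/
print(to_markdown_link("/hub/initial-setup/support/#some-heading"))
# /hub/initial-setup/support.md#some-heading
```

Note that the anchor rule as written is not restricted to /hub, so an off-site link containing /# would still be rewritten; anchoring it the same way as the first rule would avoid that.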

I tried out the substitution you suggested and while it did work, it ended up having the unintended consequence of breaking every link off site that we link to.

Please always include links to failed checks, it makes it easy to debug. The idea of having two check runs is also good, let me know how you solve this issue and I can add it to the README file.
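
A rough sketch of the two-run idea, written as Python that builds both configs and shows which links each run would look at. The split into "internal" and "external" runs, the specific ignore regexes, and the is_ignored helper are assumptions for illustration; only the ignorePatterns and replacementPatterns keys come from the config format discussed here.

```python
import json
import re

# Run 1: check site-internal links against the repo, skip full URLs.
internal_config = {
    "projectBaseUrl": "content/en",
    "ignorePatterns": [{"pattern": "^https?://"}],
    "replacementPatterns": [
        {"pattern": "/$", "replacement": ".md"},
        {"pattern": "^/", "replacement": "{{BASEURL}}/"},
    ],
}

# Run 2: check everything else, skip the site-internal links.
external_config = {
    "ignorePatterns": [{"pattern": "^/"}],
}


def is_ignored(link, config):
    """Assumed semantics: a link is skipped if any ignore pattern matches."""
    return any(re.search(rule["pattern"], link)
               for rule in config.get("ignorePatterns", []))


links = ["/hub/sharing/smb/smb1/", "https://www.truenas.com/docs/hub/"]
configs = {"internal": internal_config, "external": external_config}
checked_by = {
    name: [link for link in links if not is_ignored(link, cfg)]
    for name, cfg in configs.items()
}
print(checked_by)

# Each dict can be written out as the JSON config file for its own action.
print(json.dumps(internal_config, indent=2))
```

Relative links such as ../page/ match neither ignore pattern in this sketch, so both runs would check them.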

@q5sys
Author

q5sys commented Feb 3, 2021

Thanks for the additional info, I really appreciate your help. Is there any way I can buy you a beer or a coffee? haha. The replies you've given have definitely helped clean up a lot of the false positives that I was running into, but I'm still running into a few odd issues. Once I get these figured out, I'll write up a little troubleshooting thing and do a PR into the README in your repo.

I've tried to make the links with # a lot simpler, but they are still failing: https://github.com/freenas/documentation/pull/524/checks?check_run_id=1825026546#step:4:97
The file definitely exists: https://github.com/freenas/documentation/blob/Docs-1631-test/content/en/hub/initial-setup/support.md
Here's the current config I'm working with:
https://github.com/freenas/documentation/blob/Docs-1631-test/static/config.json

I'm not sure what's going wrong here:
https://github.com/freenas/documentation/pull/524/checks?check_run_id=1825026546#step:4:772

And then I've got a really odd situation where it is able to scan this file but can't find references back to itself:
https://github.com/freenas/documentation/pull/524/checks?check_run_id=1825026546#step:4:1267
How can it not find the file it's currently reading?
File exists here: https://github.com/freenas/documentation/blob/Docs-1631-test/content/en/hub/intro/COREHardwareGuide.md

And in that same file I get issues where it doesn't find files which are there. At this point this seems to be the majority of the issues I have left:
https://github.com/freenas/documentation/pull/524/checks?check_run_id=1825026546#step:4:1280
File exists here: https://github.com/freenas/documentation/blob/Docs-1631-test/content/en/hub/initial-setup/storage/sed-drives.md

@gaurav-nelson
Owner

gaurav-nelson commented Feb 5, 2021

@q5sys I tried your repository with https://github.com/gaurav-nelson/documentation/blob/8523b2584808e1db03dafef83fc39dae2560e669/static/config.json

The problems I see are:

  1. All links in your site are lowercase, whereas some filenames are mixed case.
  2. Some links point to folder names (which in turn show the file _index.md)

I think you'll need a custom script to cater for all these different cases.
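
A minimal sketch of what such a custom script could look like, assuming the two cases listed above are the only ones to handle: lowercase links pointing at mixed-case filenames, and folder links that map to _index.md. The resolve_site_link function is hypothetical, and the temporary directory tree only mimics the repo layout for demonstration.

```python
import tempfile
from pathlib import Path


def resolve_site_link(link, content_root):
    """Return the Markdown file a site link maps to, or None if not found."""
    # Drop any #anchor and surrounding slashes, then compare case-insensitively.
    relative = link.split("#", 1)[0].strip("/").lower()
    targets = {f"{relative}.md", f"{relative}/_index.md".lstrip("/")}
    for candidate in Path(content_root).rglob("*.md"):
        if candidate.relative_to(content_root).as_posix().lower() in targets:
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "hub/intro").mkdir(parents=True)
    (root / "hub/intro/COREHardwareGuide.md").write_text("# guide\n")
    (root / "hub/sharing").mkdir(parents=True)
    (root / "hub/sharing/_index.md").write_text("# sharing\n")

    hits = [
        resolve_site_link("/hub/intro/corehardwareguide/", root),  # mixed case
        resolve_site_link("/hub/sharing/", root),                  # folder link
        resolve_site_link("/hub/nope/", root),                     # missing
    ]
    found = [p.relative_to(root).as_posix() if p else None for p in hits]

print(found)
# ['hub/intro/COREHardwareGuide.md', 'hub/sharing/_index.md', None]
```

Scanning every .md file per link is slow on a large docs tree; building the lowercase lookup table once and reusing it would be the obvious next step.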

@q5sys
Author

q5sys commented Mar 3, 2021

I'm going to close this ticket out, as it's clear that this is not a bug; it's just a complicated configuration issue. Thanks for your assistance with tweaking the regex.

@q5sys closed this as completed Mar 3, 2021