usage with GitLab CI #118

Closed
oupala opened this issue Oct 18, 2017 · 31 comments

@oupala
Contributor

oupala commented Oct 18, 2017

How would you use decktape with GitLab CI?

I tried many configurations, but it always fails with the following error message:

filename argument is required

I use the following .gitlab-ci.yml configuration file:

export:
  image: astefanutti/decktape
  script:
  - decktape http://url.example.com/ slides.pdf
  artifacts:
    paths:
    - slides.pdf
@astefanutti
Owner

I don't know GitLab CI, though it seems like the command isn't right. It should execute something like:

 $ decktape [options] [command] <url> <filename>

Here it seems the filename is missing for some reason. You may have to change the script attribute or declare a service. You can find some documentation here: https://docs.gitlab.com/ce/ci/docker/using_docker_images.html.

I'm closing this as it is related to GitLab CI only, but feel free to open a new issue in case it is something to be fixed in DeckTape.

@oupala
Contributor Author

oupala commented Oct 18, 2017

The problem is that your example explains how to launch a container from the decktape image with some parameters.

With GitLab CI, the script section of the .gitlab-ci.yml file is run inside the container; that is, we are already inside a container. So I think I am missing something in GitLab CI or in the decktape logic.

I'll continue to search and I'll come back here if I find something interesting.

But if someone else has already solved this issue, please comment here!

@carstencodes

carstencodes commented May 5, 2019

@oupala

If this is still important to you:

According to the GitLab documentation, you must override the entrypoint for this:

 export:
  image: 
    name: astefanutti/decktape
    entrypoint:
     - "node"  
     - "/decktape/decktape.js" 
     - "reveal" 
     - "--chrome-path" 
     - "chromium-browser" 
     - "--chrome-arg=--no-sandbox"
     - "/builds/<namespaceofyourrepo>/<nameofyourrepo>/slides.html" 
     - "/builds/<namespaceofyourrepo>/<nameofyourrepo>/slides.pdf"
  script:
  - echo "Hello, world"
  artifacts:
    paths:
    - slides.pdf

In your example:

 export:
  image: 
    name: astefanutti/decktape
    entrypoint:
     - "node"  
     - "/decktape/decktape.js" 
     - "reveal" 
     - "--chrome-path" 
     - "chromium-browser" 
     - "--chrome-arg=--no-sandbox"
     - "http://url.example.com/" 
     - "slides.pdf"
  script:
  - echo "Hello, world"
  artifacts:
    paths:
    - /slides/slides.pdf

This should do.

Regards

Carsten

@oupala
Contributor Author

oupala commented May 16, 2019

Your answer comes long after my question. Thanks a lot @carstencodes for replying and sharing your solution.

Unfortunately, it does not work for me at the moment. Could you please help me debug my problem?

Here is my execution log:

Checking out 420f7bc4 as master...

Skipping Git submodules setup
Loading page file:///builds/<namespaceofyourrepo>/<nameofyourrepo>/index.html ...

Error: Navigation failed because browser has disconnected!
Loading page file:///builds/<namespaceofyourrepo>/<nameofyourrepo>/index.html ...

Error: Navigation failed because browser has disconnected!
ERROR: Job failed: exit code 1

I did replace <namespaceofyourrepo>/<nameofyourrepo> with my GitLab namespace and repo name in my .gitlab-ci.yml.

Do you understand why Chrome disconnects, and how to debug this? As everything is local to the container, there should not be any network issue...

@oupala
Contributor Author

oupala commented Mar 23, 2020

I went back to work to try to make decktape work with GitLab CI and I now have a new problem:

from origin 'null' has been blocked by CORS policy

Do you understand where this problem comes from?

Loading page file:///builds/group/project/index.html ...
Access to XMLHttpRequest at 'file:///builds/group/project/content/project/index.md' from origin 'null' has been blocked by CORS policy: Cross origin requests are only supported for protocol schemes: http, data, chrome, https.
Page error: Error
    at XMLHttpRequest.s.onerror (file:///builds/group/project/js/remark.min.js:17:19473)
    at c (file:///builds/group/project/js/remark.min.js:17:19496)
    at new r (file:///builds/group/project/js/remark.min.js:17:20793)
    at r.create (file:///builds/group/project/js/remark.min.js:1:21010)
    at file:///builds/group/project/index.html:23:32
Loading page finished with status: 0
Remark JS plugin activated
Error: Evaluation failed: DOMException: Failed to read the 'rules' property from 'CSSStyleSheet': Cannot access rules
    at __puppeteer_evaluation_script__:4:20
ERROR: Job failed: exit code 1

As far as I understand, the problem is that I'm using the local file:// protocol scheme. But I do not have any web server to serve my remark presentation, as the pipeline is about deploying the presentation to GitLab Pages.

@carstencodes can you confirm that the tip you gave me about a year ago still works for you?

@astefanutti
Owner

@oupala, this is likely due to Chromium security constraints. Could you try with:

$ decktape --chrome-arg=--allow-file-access-from-files ...

@oupala
Contributor Author

oupala commented Apr 16, 2020

I started to work on this again, and I had some success by simplifying the problem: I'm now trying to generate a PDF from an external resource instead of a local slideshow (taken from the git repository).

This is the GitLab CI job:

export:
  image: 
    name: astefanutti/decktape
    entrypoint:
    - "node"  
    - "/decktape/decktape.js" 
    - "remark" 
    - "--chrome-path" 
    - "chromium-browser" 
    - "--chrome-arg=--no-sandbox"
    - "--chrome-arg=--allow-file-access-from-files"
    - "http://example.org/presentation"
    - "slides.pdf"
  script:
  - echo "Hello, world"
  artifacts:
    paths:
    - /slides/slides.pdf

And here are the logs:

Running with gitlab-runner 11.10.1 (1f513601)
  on Docker-in-Docker Shared Debian multi Runner 9e412ff5
Using Docker executor with image astefanutti/decktape ...
Pulling docker image astefanutti/decktape ...
Using docker image sha256:7e3ec71ef2df7104584368d8206a7b61c673481118d22f4bac34c20d50df731c for astefanutti/decktape ...
Running on runner-9e412ff5-project-28411-concurrent-0 via docker...
Reinitialized existing Git repository in /builds/group/project/.git/
Fetching changes...
From https://gitlab.com/group/project
 * [new ref]         refs/pipelines/1419057 -> refs/pipelines/1419057
   078e9d1..3ee59d1  master                 -> origin/master
Checking out 3ee59d1c as master...
Skipping Git submodules setup
Loading page http://example.org/presentation ...
Loading page finished with status: 200
Remark JS plugin activated
Printing slide #17      (17/17) ...
Printed 17 slides
Loading page http://example.org/presentation ...
Loading page finished with status: 200
Remark JS plugin activated
Printing slide #17      (17/17) ...
Printed 17 slides
Uploading artifacts...
WARNING: /slides/slides.pdf: no matching files     
ERROR: No files to upload                          
Job succeeded

I can see 2 strange things:

  1. the pdf generation seems to be executed twice
  2. the job cannot see the generated pdf in order to make it available as an artifact

@oupala
Contributor Author

oupala commented Apr 16, 2020

I also have a problem generating a PDF from a local source file. I think I have a general problem with accessing files from inside the container, either for reading or for writing (after generating the PDF, I have to get it out of the container).

If anyone has an idea...

@carstencodes

2. the job cannot see the generated pdf in order to make it available as an artifact

Just a guess: you're using an absolute path for the artifact, while the entrypoint uses a relative path. Have you tried using either an absolute path for the generation or a relative path for the artifact?

@oupala
Contributor Author

oupala commented Apr 16, 2020

Just after my previous comment, I tried the following job:

export:
  image: 
    name: astefanutti/decktape
    entrypoint:
    - "node"  
    - "/decktape/decktape.js" 
    - "remark" 
    - "--chrome-path" 
    - "chromium-browser" 
    - "--chrome-arg=--no-sandbox"
    - "--chrome-arg=--allow-file-access-from-files"
    - "http://example.org/presentation"
    - "/slides/slides.pdf"
  script:
  - echo "Hello, world"
  artifacts:
    paths:
    - /slides/slides.pdf

So I was using an absolute path for both paths, and it did not work either.

@carstencodes

Strange. Tomorrow I'll give it a try myself.

I tried it with my old configuration and it worked for me:

  • gitlab-runner 12.9
  • decktape 2.9
  • revealjs plugin

I'll keep you informed, if I get decktape 2.11 running ...

@astefanutti
Owner

@oupala, it is possible you have to mount a volume so that the exported PDF is accessible from outside the container, by adding the following argument to the entrypoint, e.g.:

-v `pwd`:/slides

Besides, with recent versions of Chromium, you should use --chrome-arg=--disable-web-security instead of --chrome-arg=--allow-file-access-from-files to load files locally.

@oupala
Contributor Author

oupala commented Apr 21, 2020

Thanks @astefanutti for your help. Nonetheless, I know about mounting volumes in Docker. I'm using decktape successfully on my laptop with the following command line:

docker run --rm -v `pwd`:/slides astefanutti/decktape <url>

However, GitLab CI/CD behaves differently and manages volumes itself. You don't run the Docker container yourself; GitLab does. As a user, you only define jobs that will be executed by the CI/CD module of GitLab.

The way to make the exported PDF accessible from outside the container is to define an artifact and its path. This is where I'm getting stuck.

Maybe @carstencodes can show me a working example so that I can find out where my problem comes from.

@oupala
Contributor Author

oupala commented Apr 21, 2020

I made another try and I partially improved the situation: I can now generate a pdf and export it as an artefact.

The problem was in the artifact path: the comment from @carstencodes on May 5, 2019 proposed 2 different syntaxes:

  artifacts:
    paths:
    - /slides/slides.pdf

or

  artifacts:
    paths:
    - slides.pdf

The second solution is the one working for me: the artifact path must be just the filename of the PDF, relative to the project directory, without any leading path.

    # in the entrypoint
    - "/builds/<namespaceofyourrepo>/<nameofyourrepo>/export.pdf"
    # in the artifact definition
    - export.pdf
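
One way to avoid this kind of mismatch is to derive the artifact entry from the output path instead of typing it twice. This is only a minimal sketch, assuming the job runs in GitLab's project directory (exposed as the predefined CI_PROJECT_DIR variable); artifact_relpath is a hypothetical helper name, not part of decktape or GitLab:

```shell
# Hypothetical helper: turn an absolute output path into the path,
# relative to the project directory, that artifacts:paths expects.
# Assumes CI_PROJECT_DIR is set, as it is inside a GitLab CI job.
artifact_relpath() {
  case "$1" in
    "$CI_PROJECT_DIR"/*)
      # strip the project directory prefix and the slash that follows it
      printf '%s\n' "${1#"$CI_PROJECT_DIR"/}"
      ;;
    *)
      # anything outside the project directory cannot be an artifact
      echo "error: $1 is outside $CI_PROJECT_DIR" >&2
      return 1
      ;;
  esac
}
```

With CI_PROJECT_DIR=/builds/group/project, artifact_relpath /builds/group/project/export.pdf prints export.pdf, while /slides/slides.pdf is rejected because it lies outside the project directory.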

Generating a pdf now works if I use a remote url for the remark presentation.

But it still fails if I try to use a local remark presentation. I get the following error:

Loading page file:///builds/<namespaceofyourrepo>/<nameofyourrepo>/index.html ...
Page error: Error
    at XMLHttpRequest.s.onload (file:///builds/<namespaceofyourrepo>/<nameofyourrepo>/js/remark.min.js:17:19338)
Loading page finished with status: 0

Here is the configuration of the job I am using:

image: 
    name: astefanutti/decktape
    entrypoint:
    - "node"  
    - "/decktape/decktape.js" 
    - "remark" 
    - "--chrome-path" 
    - "chromium-browser" 
    - "--chrome-arg=--no-sandbox"
    - "--chrome-arg=--disable-web-security"
    - "/builds/<namespaceofyourrepo>/<nameofyourrepo>/index.html"
    - "/builds/<namespaceofyourrepo>/<nameofyourrepo>/export.pdf"

And the last strange thing is that the process ends with a zero return code, so GitLab thinks that everything is OK although there was an error.
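
As long as the exit code does not reflect such failures, the job itself could check the result. A minimal sketch, assuming it is called from the job's script section after the export has run; require_pdf is a hypothetical helper name, and testing for a non-empty file that starts with the %PDF signature is only a heuristic:

```shell
# Hypothetical helper: fail unless the given file looks like a PDF.
require_pdf() {
  # fail when the file is missing or empty
  if [ ! -s "$1" ]; then
    echo "error: $1 is missing or empty" >&2
    return 1
  fi
  # a PDF file starts with the four bytes "%PDF"
  if [ "$(head -c 4 "$1")" != "%PDF" ]; then
    echo "error: $1 does not look like a PDF" >&2
    return 1
  fi
}
```

Calling require_pdf export.pdf as the last script line would then make the job fail when no usable PDF was written.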

@carstencodes

I cannot say why decktape.js exits with code 0. I'm not an expert on remark.js; actually, I don't even know it. Maybe the exit code is 0 because Chromium loaded the file successfully: even though it has rendering and processing issues, the file is there and can be loaded. According to the unofficial guide, there is no option to set up a behavior like this.

Does it run on your machine when executing this command in the docker container itself using docker exec or docker run?

Otherwise, I must admit that I don't have a clue.

@astefanutti
Owner

Right, it seems remark.js cannot load the presentation for some reason:

file:///builds/<namespaceofyourrepo>/<nameofyourrepo>/js/remark.min.js:17:19338

Using a non-minimised version may help to pinpoint the root cause.

Then, I think Decktape does not catch the error correctly, although it should return a non-zero exit code.

@oupala
Contributor Author

oupala commented Apr 22, 2020

I replaced the minified version of remark by a non minified version (version 0.15.0), and we can now locate where the problem happens:

Loading page file:///builds/<namespaceofyourrepo>/<nameofyourrepo>/index.html ...
Page error: Error
    at XMLHttpRequest.xhr.onload (file:///builds/<namespaceofyourrepo>/<nameofyourrepo>/js/remark.js:26477:17)
Loading page finished with status: 0

The corresponding line (26477) in the js file is linked to loading the file:

  function loadFromUrl (url, callback) {
    var xhr = new dom.XMLHttpRequest();
    xhr.open('GET', options.sourceUrl, true);
    xhr.onload = function (e) { 
      if (xhr.readyState === 4) {
        if (xhr.status === 200) {
          options.source = xhr.responseText.replace(/\r\n/g, '\n');
          loadFromString(options.source);
          if (typeof callback === 'function') {
            callback(self);
          }
        } else {
          throw Error(xhr.statusText);
        }
      }     
    };    
    xhr.onerror = function (e) { 
      throw Error(xhr.statusText);
    };    
    xhr.send(null);
    return xhr;
  }

The line 26477 is the first throw Error of the extract.

As I'm trying to load a local file, I don't understand why line 2 is about an XMLHttpRequest.

@astefanutti
Owner

Could you add a console.log statement to print the URL of the resource that’s being fetched?

It's possible there is an external file being requested.

@oupala
Contributor Author

oupala commented Apr 22, 2020

I think I've found the reason why it doesn't work: remark can't load an external file from a local filesystem. I've read that on the wiki of remark:

When working locally, with your slideshow HTML opened directly from disk, using the sourceUrl won't work out of the box. This requires hosting your files using a web server, which can be accomplished in multiple ways.

The behavior of Firefox regarding CORS and the file:/// scheme changed in version 68. This is why I'm only discovering it now. And Chrome has probably adopted the same behavior. As decktape is based on puppeteer, which is based on Chrome, there is no hope from the Chrome side.

I'm now looking for a workaround...

If anyone has an idea.

@oupala
Contributor Author

oupala commented Apr 22, 2020

One solution would be for decktape to launch a small web server and generate the PDF from it.

This is the advice of the documentation of remark:

This requires hosting your files using a web server, which can be accomplished in multiple ways, e.g. by running python3 -m http.server in the directory of your index.html file. With a web server up and running, say on port 8000, you should be able to access your files via http://localhost:8000.

What do you think of that?

@astefanutti
Owner

Could you try with --chrome-arg=--allow-file-access-from-files?

@oupala
Contributor Author

oupala commented Apr 24, 2020

@astefanutti to answer both of your last questions:

  • adding a console.log to output the loaded file shows that the requested file is the markdown file that contains the presentation: content/index.md (it is a local file)
  • adding a parameter when launching Chrome does not help; here is the command I use to test it locally (on my laptop): chrome --no-sandbox --disable-web-security --allow-file-access-from-files index.html

The problem occurs with Firefox and Chrome. The solution is a local webserver so that security issues are not raised.

With a small webserver to serve my presentation, I can successfully see my presentation in Chrome using the following command line: chrome --no-sandbox http://localhost/
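
That workaround could be scripted roughly as follows. This is a sketch under assumptions, not something decktape provides: it assumes python3 (3.7 or later, for the --directory option) is installed next to decktape, that the presentation entry point is index.html, and that the chosen port is free; export_via_http and the DECKTAPE variable are hypothetical names:

```shell
# Hypothetical helper: serve a directory over HTTP, export it with
# decktape, then stop the server again.
# Usage: export_via_http <dir> <port> <output.pdf>
export_via_http() {
  dir=$1 port=$2 out=$3
  # serve the presentation directory in the background
  python3 -m http.server "$port" --directory "$dir" >/dev/null 2>&1 &
  server_pid=$!
  sleep 1  # crude wait for the server to start listening
  # DECKTAPE may hold a full command line; it is left unquoted on
  # purpose so that it is split into words
  ${DECKTAPE:-decktape} remark "http://localhost:$port/index.html" "$out"
  status=$?
  # stop the server whatever the export result was
  kill "$server_pid" 2>/dev/null
  return "$status"
}
```

In the decktape image, DECKTAPE could be set to "node /decktape/decktape.js --chrome-path chromium-browser --chrome-arg=--no-sandbox" before calling export_via_http . 8000 slides.pdf, provided python3 is available there, which I have not checked.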

@astefanutti
Owner

@oupala thanks for the feedback.

It's not clear from the Chromium documentation what the scope of the --allow-file-access-from-files flag is. The --user-data-dir flag is mentioned as a solution in gnab/remark#388.

Would you be able to start the web server in GitLab CI? Technically, it'd be possible to do it in Decktape, but we've left that option to the user for now, and I'd be inclined to think it's not Decktape's responsibility.

@oupala
Contributor Author

oupala commented Apr 24, 2020

The problem with fixing this on the GitLab side is that it is not specific to GitLab. GitLab is one use case among many.

When I want to use decktape locally with docker to print my local remark presentation:

$ docker run --rm -v `pwd`:/slides astefanutti/decktape index.html test.pdf
Loading page file:///slides/index.html ...
Access to XMLHttpRequest at 'file:///slides/content/index.md' from origin 'null' has been blocked by CORS policy: Cross origin requests are only supported for protocol schemes: http, data, chrome, https.

Page error: Error
    at XMLHttpRequest.xhr.onerror (file:///slides/js/remark.js:26483:13)
    at loadFromUrl (file:///slides/js/remark.js:26485:9)
    at new Slideshow (file:///slides/js/remark.js:26438:5)
    at Api.create (file:///slides/js/remark.js:1851:15)
    at file:///slides/index.html:23:32
content/index.md
Loading page finished with status: 0
Remark JS plugin activated

Error: Evaluation failed: DOMException: Failed to read the 'rules' property from 'CSSStyleSheet': Cannot access rules
    at __puppeteer_evaluation_script__:4:20

And if I use the non-docker version of decktape:

$ node decktape.js --chrome-arg=--no-sandbox /path/to/index.html test.pdf
Loading page file:///home/user/presentation/content/index.html ...
Access to XMLHttpRequest at 'file:///home/user/presentation/content/index.md' from origin 'null' has been blocked by CORS policy: Cross origin requests are only supported for protocol schemes: http, data, chrome, https.

Page error: Error
    at XMLHttpRequest.xhr.onerror (file:///home/user/presentation/content/js/remark.js:26483:13)
    at loadFromUrl (file:///home/user/presentation/content/js/remark.js:26485:9)
    at new Slideshow (file:///home/user/presentation/content/js/remark.js:26438:5)
    at Api.create (file:///home/user/presentation/content/js/remark.js:1851:15)
    at file:///home/user/presentation/content/index.html:23:32
content/index.md
Loading page finished with status: 0
Remark JS plugin activated

Error: Evaluation failed: DOMException: Failed to read the 'rules' property from 'CSSStyleSheet': Cannot access rules
    at __puppeteer_evaluation_script__:4:20

So, in my opinion, the problem is not in GitLab.

By the way, I also tried the same test using the following parameter in addition to the others: node decktape.js --chrome-arg=--no-sandbox --chrome-arg=--user-data-dir /path/to/index.html test.pdf. The result was the same error.

We cannot even blame remarkjs as the limitation comes from the browsers, Firefox and Chrome.

We cannot blame Firefox and Chrome as they add security features to help the user, and they are following the W3C recommendations.

We cannot blame decktape as decktape was working perfectly until the browsers added security features.

But, it will be useless to ask remarkjs to change as they don't have control over browsers' security features.

It will be useless to ask the browsers to move as they already moved to enhance security.

This is why I think the solution is to enable a webserver into decktape.

What are your thoughts on this?

@astefanutti
Owner

Yes, embedding a web server in Decktape is technically an option. I still need user input on this to see how much it's needed vs. the extra effort and complexity. I see your pain, yet it seems to be a specific use case: Remark using an external file in a CI environment. My current understanding is that users who develop presentations in their local environments are usually capable of using a local web server, as they already use one to render their presentations.

@oupala
Contributor Author

oupala commented Apr 27, 2020

In fact, the problem that made me create this issue is solved.

The problem we are now discussing is another problem. Should I open a new issue?

I think the issue is not specific to remark, as it affects all presentations whose content is loaded from a separate file: the logic in an HTML file, the content in another file (for instance a Markdown file).

It is not only a CI problem, as some users might want to download a presentation from GitHub or GitLab and convert it to PDF. They are not necessarily expert enough to download and start a web server. They would probably not be able to understand why the presentation is not displayed in the browser.

This is why I think it could be the job of decktape to handle such complexity, to make users' lives easier.

I understand your point of view that this would add complexity to decktape, which we should avoid if it is not necessary. However, I think that this would not be useless complexity in this case.

@astefanutti
Owner

Yes, feel free to create a new issue, so that it's about better supporting local presentations with external resources, and let other users weigh in.

For users that retrieve such presentations locally, documenting how to serve the assets may be sufficient. I'd also like to dig into that --allow-file-access-from-files option, to understand why it does not do what it looks like it should be doing!

@oupala
Contributor Author

oupala commented Apr 28, 2020

Issue created.

I will also try to find out what the purpose of the --allow-file-access-from-files option is.

Edit: some people had the same problem as us.

@oupala
Contributor Author

oupala commented May 29, 2020

I found a website listing all the flags for Chrome and it appears that the usage of --allow-file-access-from-files is documented as:

By default, file:// URIs cannot read other file:// URIs. This is an override for developers who need the old behavior for testing.

So, this flag should allow a file:// resource to link to other file:// resources. As you said, I also wonder why the flag does not work as intended.

By the way, I tested using Chrome version 81.0.4044.0 (development build) (64 bits).

@AntonShevchuk

AntonShevchuk commented May 23, 2023

I want to share our solution with the community, and I hope it will help somebody.
We needed to convert many files from AsciiDoc to HTML and then to PDF files.

As a result, we ended up with 2 stages: one for the HTML pages and one for the PDF files:

stages:
  - build
  - deploy
  - export
  
pages:
  stage: deploy
  script:
    # generate HTML files from asciidoc
    - npm run build *.adoc */*.adoc
    # prepare directory with resources for web
    - mkdir .public
    - cp -r javascripts .public
    - cp -r stylesheets .public
    - cp -r node_modules/reveal.js .public/javascripts
    # copy all HTML files
    - find . -type f -not -path '*/\.*' -not -path '*/node_modules/*' -name "*.html" -exec cp --parents -t "./.public" {} +
    # copy all images directories
    - find . -type d -name "images" -not -path '*/\.*' -not -path '*/node_modules/*' -exec cp -r --parents -t "./.public" {} +
    # rename to public, this is required to generate the artifacts
    - mv .public public
    - ls -la ./public
  artifacts:
    paths:
      - public

export:
  stage: export
  image:
    name: astefanutti/decktape
    entrypoint: [""]
  script:
    - mkdir export
    - echo "Prepare PDF files"
    # generate PDF files from the HTML files
    - |+
      for file in $(find ./public -type f -not -path '*/\.*' -not -path './public/javascripts/*' -name "*.html")
      do
        node /decktape/decktape.js --chrome-path chromium-browser $file "./export/${file:9:length-5}.pdf"
      done
  artifacts:
    paths:
      - export
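
The output name in the loop above is derived with ${file:9:length-5}, a substring expansion that may not behave the same in every shell. A more portable sketch, assuming the intent is to drop the leading ./public/ and the trailing .html; pdf_path_for is a hypothetical helper name:

```shell
# Hypothetical helper: map ./public/<name>.html to ./export/<name>.pdf
# using only POSIX prefix/suffix removal.
pdf_path_for() {
  rel=${1#./public/}   # drop the leading ./public/
  rel=${rel%.html}     # drop the trailing .html
  printf './export/%s.pdf\n' "$rel"
}
```

For ./public/talks/intro.html this prints ./export/talks/intro.pdf. For nested paths like this one, the target directory may have to be created first, e.g. with mkdir -p "$(dirname "$pdf")".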

@jpggithub

Thanks a lot @AntonShevchuk for sharing your solution!

With my GitLab configuration, I had to add --chrome-arg=--no-sandbox to the decktape call, giving something like node /decktape/decktape.js --chrome-path chromium-browser --chrome-arg=--no-sandbox ...
