generate pdf version of tutorial #58

Merged
merged 7 commits into from
Sep 16, 2016

Conversation

larskotthoff
Contributor

The build now generates a PDF version of the tutorial. It generates the tutorial with external PDF figures, collates the markdown files, fixes the internal links, adds a title, and converts the result to PDF. It then generates the tutorial again with embedded SVG figures. The PDF version will be committed with the rest of the tutorial.

This generates the entire tutorial twice, but I'm not sure we care since it happens on Travis anyway.
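
As a rough illustration of the collate, fix-links, and convert steps described above, here is a minimal sketch in base R. It is not the build script from this pull request. The file layout, the `[text](page.md)` link format, the mapping of `page.md` to a `#page` anchor, and the helper names `collate_markdown` and `to_pdf` are all assumptions made for the example.

```r
# Sketch only: file layout and link format are assumptions, not the real build.
collate_markdown <- function(files) {
  # Concatenate all pages, separating them with a blank line.
  text <- unlist(lapply(files, function(f) c(readLines(f, warn = FALSE), "")))
  # [text](page.md#section) -> [text](#section)
  text <- gsub("\\]\\([A-Za-z0-9_-]+\\.md#([^)]+)\\)", "](#\\1)", text)
  # [text](page.md) -> [text](#page); assumes an anchor named after the file
  text <- gsub("\\]\\(([A-Za-z0-9_-]+)\\.md\\)", "](#\\1)", text)
  text
}

# Hypothetical conversion step; needs pandoc and a LaTeX installation.
to_pdf <- function(md_file, pdf_file, title) {
  system2("pandoc", c(shQuote(md_file), "-o", shQuote(pdf_file), "--toc",
                      "--metadata", shQuote(paste0("title=", title))))
}
```

Links to external sites are left untouched because the patterns only match bare `page.md` targets directly after `](`.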

Fixes #50.

mlr-tutorial.pdf

@larskotthoff
Contributor Author

The build is broken because of other issues. Please hold off on merging until it is fixed.

@jakob-r
Contributor

jakob-r commented Aug 29, 2016

I am honestly impressed by the number of pages and by how good it looks in most parts, but there are still some issues left:

  • Big tables have a lot of overlapping text and are more or less unusable.
  • The figures are perhaps generated a bit too big, so the text ends up a tiny bit too small. I kind of like this look, but the others should have a look as well.
  • The contents index should be clickable.
  • Do the external links work? Not for me. Yes, they do work.
  • Each page should have a header mentioning the section (and subsection?).
  • The authors should be mentioned somewhere.

Personally, as I mentioned before, I would try to drop the "Integrated *" parts in the PDF version because:

  • They do not look good.
  • They become outdated quite fast.

@schiffner
Contributor

Do the external links work? Not for me.

For me they work, the links to rdocumentation as well as links to papers and other websites.

@larskotthoff
Contributor Author

  • Big tables: Don't know how to fix that. Some of them may work in landscape (though it's not obvious to me how to do that), but not all of them.
  • Figures: We can probably set some kind of magnification somewhere? Didn't see anything obvious though.
  • Are you talking about the table of contents? Those should be links.
  • Headers and footers should be easy to add later (https://tex.stackexchange.com/questions/139139/adding-headers-and-footers-using-pandoc).
  • Authors are no problem either. What's the author list?

Dropping the integrated* parts would cause some broken links throughout. I don't see the argument for them being outdated quickly as we regenerate the PDF every time the tutorial is built.

@larskotthoff
Contributor Author

Headers now added, see new attachment -- they overlap on some pages, but can be customised in the usual LaTeX way.

mlr-tutorial.pdf

@berndbischl
Contributor

  1. First of all, I didn't want to be "rude" to Lars.

  2. Thanks a bunch! This is very cool! Apparently we don't have to write the mlr book :) It exists.

  3. WRT the appendix tables: I see possibilities to change the layout here, but I think we should drop them as @jakob-r suggested.

Authors are no problem either. What's the author list?

Just take everybody who made a commit to the repo?
And we can ask over Slack whether we have forgotten someone.

  4. Is it possible to have some chapter numbering?

Thanks again!

@schiffner
Contributor

I haven't found out yet where/how this is happening, but none of the figures is where it is supposed to be. For example on p. 24 there should be a scatterplot instead of the threshold-vs-perf plot and p. 35 shows ROC curves instead of the learner-prediction plots.

@larskotthoff
Contributor Author

Argh, the filenames generated by knitr are not unique. I'll fix that.
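
For context: knitr builds each figure filename from the `fig.path` prefix plus the chunk label, and unlabelled chunks get generic labels such as `unnamed-chunk-1`, so two Rmd files knitted with the same `fig.path` can write to the same file. One possible way to avoid this is a per-file prefix. The sketch below is not necessarily the fix applied in this pull request, and the helper names `fig_prefix` and `knit_all` are hypothetical.

```r
# Build a figure path prefix that is unique per Rmd file,
# e.g. "src/task.Rmd" -> "figure/task-".
fig_prefix <- function(rmd_file, dir = "figure") {
  stem <- tools::file_path_sans_ext(basename(rmd_file))
  file.path(dir, paste0(stem, "-"))
}

# Hypothetical driver: knit every file with its own prefix.
# Requires the knitr package; defined here but not called.
knit_all <- function(rmd_files) {
  for (f in rmd_files) {
    knitr::opts_chunk$set(fig.path = fig_prefix(f))
    knitr::knit(f)
  }
}
```

With this prefix, an unlabelled first chunk in `task.Rmd` and one in `learner.Rmd` would produce differently named figure files.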

@larskotthoff
Contributor Author

  • Figures fixed, thanks for noticing!
  • Appendices are no longer part of the PDF version.
  • Authors added in order of number of commits (who is IcedragonP?)
  • Numbering added.
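
Ordering authors by commit count, as mentioned in the list above, can be sketched by parsing the output of `git shortlog -sn`. This is an illustration, not the code used here. The helper `parse_shortlog`, the alias table, and the sample names and counts are made up for the example. An alias table is one way to handle cases like a second username belonging to an existing author.

```r
# Parse `git shortlog -sn` style lines ("  <count>\t<name>"), merge aliases,
# and return authors ordered by total number of commits.
parse_shortlog <- function(lines, aliases = character()) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  parts <- strsplit(lines, "\t", fixed = TRUE)
  counts <- as.integer(vapply(parts, `[`, character(1), 1))
  authors <- vapply(parts, `[`, character(1), 2)
  # Map known aliases to the real author name.
  known <- authors %in% names(aliases)
  authors[known] <- aliases[authors[known]]
  totals <- sort(tapply(counts, authors, sum), decreasing = TRUE)
  data.frame(author = names(totals), commits = as.integer(totals),
             stringsAsFactors = FALSE)
}

# In a repository one could use:
#   lines <- system2("git", c("shortlog", "-sn", "HEAD"), stdout = TRUE)
# Made-up sample input for illustration:
lines <- c("   120\tAuthor A", "    40\tAuthor B",
           "    15\talias-b", "     3\tAuthor C")
parse_shortlog(lines, aliases = c("alias-b" = "Author B"))
```

Contributors who never committed themselves (for example, work sent by email) will not appear in this output and need to be added by hand.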

mlr-tutorial.pdf

@schiffner
Contributor

schiffner commented Aug 29, 2016

Thanks very much.

who is IcedragonP

I think this is @PhilippPro (?)
EDIT: Yep. I'm sure now, you can see this in PR #13 for example.

@schiffner
Contributor

Regarding the author list: Did you also look in the old mlr gh-pages branch?

  • We need to add Michel who wrote the wrapper page and did many other nice things.
  • Tobias Kühn wrote the imbalanced classification page. (He is probably not in the history at all because he sent his stuff by email and I committed it for him.)
  • And studerus and Dominik Kirchhoff had one commit each.

@berndbischl
Contributor

We need to add Michel who wrote the wrapper page and did many other nice things.
Tobias Kühn wrote the imbalanced classification page. (He is probably not in the history at all because he sent his stuff by email and I committed it for him.)

yes (twice)

who is IcedragonP
I think this is @PhilippPro (?)

It is either P. Probst, who must be added, or somebody irrelevant.

And studerus and Dominik Kirchhoff had one commit each.

Did they do relevant, larger changes or just mini commits?

@berndbischl
Contributor

authors: giuseppe did the classifier calibration stuff.

@berndbischl
Contributor

WRT numbering and structure: thx, much better now.
Can we still have the "superstructure" (Basic, Advanced, Extend) somehow visible in there?

@schiffner
Contributor

authors: giuseppe did the classifier calibration stuff.

Zach wrote this (https://github.com/mlr-org/mlr-tutorial/commits/gh-pages/src/classifier_calibration.Rmd)

@schiffner
Contributor

And studerus and Dominik Kirchhoff had one commit each.

Did they do relevant, larger changes or just mini commits?

studerus: no
Dominik: wrote a small section about the survival learners on the create learner page

@jakob-r
Contributor

jakob-r commented Aug 30, 2016

Figures: We can probably set some kind of magnification somewhere? Didn't see anything obvious though.

This is actually easily done by setting the internal drawing size of the graphics device a bit smaller.
These would be my go-to values:

# Global knitr chunk defaults (in inches); a smaller device makes text look larger
knitr::opts_chunk$set(
  fig.width = 8,
  fig.height = 5
)

@larskotthoff
Contributor Author

  • Fixed author list.
  • Introduced Basic/Advanced/Extend.
  • Scaled figures.

mlr-tutorial.pdf

@PhilippPro
Contributor

IcedragonP = Philipp Probst

So you can delete IcedragonP.

@schiffner
Contributor

schiffner commented Aug 30, 2016

Thx very much, Lars!!!

Re: Figures
Sorry for not doing this sooner, I just looked a little more closely at the figures.

  • The figures with very small text / thin lines result from the relatively large fig.height and fig.width values we set in code chunks in the Rmd files. The settings in the build file don't help in these cases because they are overridden by the values in the chunks (e.g. p. 183).
  • Some figures in the PDF should be square, and for others the height = 5, width = 8 setting does not fit that well.

So my proposal is

  • we don't scale the figures in the build file at all
  • I go through the tutorial and set the aspect ratios correctly (instead of width and height).
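
To make the two points above concrete, here is a small base R model of how the effective figure size comes about. It is a simplified illustration and not knitr's actual code. The helper `effective_size` is hypothetical, and the default values are the ones suggested earlier in this thread. It relies on two knitr behaviours: chunk-level options take precedence over global defaults, and when the chunk option `fig.asp` (height divided by width) is set, the height is computed as `fig.width * fig.asp`.

```r
# Build-file defaults (values suggested earlier in this thread).
defaults <- list(fig.width = 8, fig.height = 5)

# Simplified model: chunk options override the defaults, and a numeric
# fig.asp determines the height from the width.
effective_size <- function(chunk = list(), defaults) {
  opts <- utils::modifyList(defaults, chunk)
  if (is.numeric(opts[["fig.asp"]])) {
    opts[["fig.height"]] <- opts[["fig.width"]] * opts[["fig.asp"]]
  }
  opts[c("fig.width", "fig.height")]
}

# A chunk that hard-codes a large size ignores the build-file defaults.
effective_size(list(fig.width = 12, fig.height = 12), defaults)

# A chunk that only sets the aspect ratio keeps the default width.
effective_size(list(fig.asp = 1), defaults)
```

In an Rmd file, the second case would correspond to a chunk header such as `{r, fig.asp = 1}`.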

@larskotthoff
Contributor Author

Ok, sounds good. There are many imperfections anyway and we can't fix all of them now.

I've fixed the author list.

@schiffner
Contributor

Ok, sounds good. There are many imperfections anyway and we can't fix all of them now.

Unfortunately... But I fixed most of the plots now.
Wanted to fix the table in the visualization section next.

@schiffner
Contributor

We need to add Janek to the author list.

@schiffner
Contributor

schiffner commented Aug 31, 2016

Small issues:

  • Broken links to Appendices
  • Overflowing output in code chunks
  • Code indentation seems to be 4 spaces if lines are broken automatically
  • Tables (e.g. in the Visualization section). Possible solutions: change the formatting; must all these tables be tables?
  • headers
  • Language fits a web page and not a pdf
  • explanations and plots are sometimes far apart

@larskotthoff
Contributor Author

What's the verdict on this? Fixing all the minor issues will take a lot of effort and time. I vote to merge this now so that we have a reasonable, but not perfect PDF, and improve this later.

@schiffner
Contributor

I'm also pro merging.

@berndbischl
Contributor

I'm also pro merging.

then do it.

@larskotthoff
Contributor Author

Ok, why didn't this squash the commits when I merged? It seems to be set up to do that?

@berndbischl
Contributor

  1. If the tutorial now exists as a PDF, can we please put it on arXiv ASAP? That was the motivation.

Ok, why didn't this squash the commits when I merged? It seems to be set up to do that?

The settings now seem to allow both squash merges and normal, multi-commit merges.
The normal mlr repo only allows squash merges. I will change this here now, too.
But next time, simply check the options for this. This is the only thing that influences it
(and where you click...).

@schiffner schiffner mentioned this pull request Sep 19, 2016
9 tasks