Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Manual Beta7 Dev Branches and PRs #71

Closed
13 tasks done
tajmone opened this issue Sep 16, 2020 · 12 comments
Closed
13 tasks done

Fix Manual Beta7 Dev Branches and PRs #71

tajmone opened this issue Sep 16, 2020 · 12 comments
Assignees
Labels
📖 Alan Manual Issues relating to "The Alan Language Manual" ⚠️ important Priority: High
Milestone

Comments

@tajmone
Copy link
Collaborator

tajmone commented Sep 16, 2020

Clean up the dev branches for the upcoming Beta7 release of the ALAN Manual (the first official release):

  • Rebase beta7-prep on master, solving conflicts as required.
  • Rebase all beta7-* dev branches on the fixed beta7-prep branch:
    • AppG-i18n-squash
    • beta7-prep-PDF-gaps
    • handle_elisions (was already merged)
  • Review and merge/reject all pending PRs of beta7-* dev branches — rebase and fix them as needed and then merge them, or delete them if rejected:

After all PRs are merged, we should consider deleting left over branches:

  • AppG-i18n (the unsquashed version of AppG-i18n-squash)
  • AppG-i18n-squash
  • beta7-prep-PDF-gaps
  • handle_elisions

@thoni56, we need to unravel the conflicts with the dev branches and master, so we'll be able to merge/squash without problems when Manual Beta7 is ready for release.

I'll try to reabse beta7-prep on master, and then all other beta7-* branches on beta7-prep. Then we really need to merge the beta7-* branches back into beta7-prep, so if you could start reviewing them we'll be able to save some time (not sure why those PRs have been hanging ther for so long, but after such a long time is going to be harder to review them and catch up).

As I start to get a clearer picture of the required steps, I'll be updating this entry post and the task list above.

@tajmone tajmone added 📖 Alan Manual Issues relating to "The Alan Language Manual" ⚠️ important Priority: High labels Sep 16, 2020
@tajmone tajmone added this to the Alan Beta7 milestone Sep 16, 2020
@tajmone tajmone added this to To do in Alan Manual Beta7 Release via automation Sep 16, 2020
@tajmone tajmone pinned this issue Sep 16, 2020
@tajmone
Copy link
Collaborator Author

tajmone commented Sep 16, 2020

Merge PRs First, Fix Conflicts After

@thoni56, maybe the simpler approach is to first review and accept the PRs on beta7-* dev branches, since they are in synch with the beta7-prep branch, and then fix conflicts on beta7-prep by rebasing it on master branch.

This way we'll have to focus on a single branch, instead of 4.

@thoni56
Copy link
Contributor

thoni56 commented Sep 16, 2020

I used https://draftable.com to easily diff two pdfs. It doesn't detect whitespace changes, so it misses this exact PR, but it is easy to scroll two PDFs side by side giving you this view:
image

@tajmone
Copy link
Collaborator Author

tajmone commented Sep 17, 2020

Rebasing All PRs on Master after EditorConfig Cleanup

@thoni56, as mentioned in #72, I've started to enforce consistent code styles via EditorConfig in dev-editorconfig (branched from master).

What I would like try, locally first, is:

  1. Create backup branches of master and all the beta7-* dev branches.
  2. Merge dev-editorconfig into master.
  3. With each beta7-* PR:
    • Rebase the PR on branch the new master branch with enforced code styles consistency.
    • Cleanup the code styles of every changed file in the PR, until they pass the validation.sh checks.
    • Squash all PR commits into a single commit (should reduce conflicts).
  4. At this point, if all merge conflicts are resolved, start merging the approved PRs, and rebasing on beta7-prep all the pending PRs after each merge.
  5. After checking that all is good to go, force push all the rebased PR branches to origin (but still keep their backups locally).

This means that we should wait before merging any PR into beta7-prep — otherwise we'll have to fix the code styles and amend every commit of beta7-prep, which is harder than doing it on each PR before it's merged.

Since the dev-editorconfig branch also fixes code styles in many sources (some of which are involved in the PRs' changes), this seems the only feasible approach without breaking the PR branches.

Also, this should not only help to solve all current conflicts, but also make it easier to keep rebasing beta7-prep on master in the future — after all, one of the purposes of EditorConfig settings is to eliminate much of the noise that usually breaks history linearity due to meaningless content changes introduced by editors differences.

What do you think? Is this the right way to go? Am I mistaken, or missing out something?


P.S. — The repository already contained an .editorconfig settings file, but it only covered Alan sources and the .a3sol/.a3log solutions/transcripts files, to enforce Latin1 encoding, trimming trailing spaces and enforcing an empty line at the file end (to prevent, respectively, ALAN compilation errors, erratic spaces in transcripts, and incomplete transcript sessions). So it was more of a quick fix than a well-thought enforcement of consistent code styles.

The current update will be covering all the file types used in the project, enforcing good styles conventions on the various editors/IDEs, and — last but not least — validating all commits and PRs via Travis CI, to prevent noise in contributions.

@tajmone
Copy link
Collaborator Author

tajmone commented Sep 17, 2020

PDF Diffing

I used https://draftable.com to easily diff two pdfs. It doesn't detect whitespace changes, so it misses this exact PR, but it is easy to scroll two PDFs side by side giving you this view:

The PDF diffs of this tool look quite neat.

I use BeyondCompare 4 for diffing, which does offer specialized viewers for diffing various formats (Hex, images, CSV files, MP3, and others), including PDF, but the PDF diff previewer doesn't reproduce the actual styles, it only shows the text contents representation, without whitespace differences, but more like if it was a plain text file (therefore, no images either). I haven't used it much lately, so I don't remember the details, but surely it's not as neat as the dedicated tool you pointed out.

Both BeyondCompare an Draftable are commercial tools, so having already invested money on the former I'm reluctant to spend further money on another tool — having to renew license when there's a major update, as well as having to learn a new tool and keeping it updated on the machine — but, surely, Draftable is quite tempting for someone who has to diff PDFs frequently.

BeyondCompare 4 is a quite a powerful tool, with many features and customization options, and in over three years after its purchase I can't really say that I've managed to learn how to make the most out of it — can't find the time to dig into the docs and play around with it. I mostly use it at a basic level, although I did attempt to create an ALAN syntax highlighter for it (poor results, the syntax format is not very nice, seems buggy and has an horrible integration interface).

In the future I should really invest more energy on BC4, especially on customizing automated diffing reports on a per-project basis, which is one of the cool features that attracted me to BC4 (beside the possibility to integrate with Git and many Git GUIs). But then, I loath having to spend more time learning the tools of the trade instead of focusing on the trade itself (i.e. programming instead of tooling).

@thoni56
Copy link
Contributor

thoni56 commented Sep 17, 2020

Since the dev-editorconfig branch also fixes code styles in many sources (some of which are involved in the PRs' changes), this seems the only feasible approach without breaking the PR branches.

Because of this I think you are right. Although I haven’t consideted the merge of #40.

@tajmone
Copy link
Collaborator Author

tajmone commented Sep 17, 2020

@thoni56, the good news is that I've finished enforcing code styles consistency across the repository (still in dev-editorconfig only) and none of the Manual AsciiDoc sources required any fixes!

So, even after merging into master the dev-editorconfig branch, these changes shouldn't affect the PRs rebasing operations (at least, not the AsciiDoc sources, which would be the trickier to handle).

Right now, in dev-editorconfig I've kept the sources fixes into individual commits per file type involved (i.e. all markdown fixes in one commit, and so on with shell scripts, Alan sources, AsciiDoc, etc.).

I was wondering if we should squash them into a single commit or just keep them separate. On the one hand, having split commits per file type might be more manageable in terms of revisiting the repo history (blaming, etc.); but on the other hand it makes a longer commit history (where we could squash 4 commits into a single one).

What do you suggest, based on your experience with long-running Git repos? Separate commits per file-type fixes, or all fixes squashed into a single commit?

@thoni56
Copy link
Contributor

thoni56 commented Sep 18, 2020

@tajmone Nice! And I like separate commits for the formatting fixes.

In my experience, there is seldom a problem with many commits, This is a problem that shows up when the development organization grows. Up to a couple of teams (~10) I recommend single-branch development and committing directly to master (or your main branch) because of the immediate feedback you get (and it forces you to run tests and what not, locally).

Again, in my experience, the value of a main branch with squashed commit so that you only get a single commit per "feature" only shows itself if you have a QA-department and process that inspects, tests and documents the new features in the release. And that's, luckily enough, no us ;-)

So the only reason to use branching in the alan-docs project is to keep a version synced with the releases of the SDK. OTOH we could do that by creating the HTML and PDFs from a tag which represents that sync point.

Of course, we could potentially find problems that we want to fix in the "released" documentation, but that is not a big deal. "bugs" in documentation can always wait for the next release. At least if you ask me.

(In some lean/agile philosophical perspective, branching pushes feedback, and work, to a later time, which is always something to watch out for. And avoid, or minimize, if you can.)

This became kind of a rant about branching, sorry for that, but I guess it's a background for my feeling that it doesn't matter much if we squash or not. We don't have the same "requirements" as large development organisations.

@tajmone
Copy link
Collaborator Author

tajmone commented Sep 18, 2020

@thoni56, thanks for sharing your thoughts on branches strategy (I'll comment on them later, since they are very helpful).

Rebase Nightmare

Right now, I'm really struggling hard to reabse the beta7-prep on master — it all goes back to that "elisions with apostrophe" commit that was based on an old version of master, containing old XRefs. That, and the subtle changes in the various README files (sections being moved around, the autogenerated TOCs out of sync, etc.).

The problem is that there are too many commits in the divergent history, and it's hard to track what you're trying to fix at each rebase stage (with the hash references being hard to memorize, and the awkward use by Git of the terms "ours" and "theirs"not helping either).

Probably the best approach would be to rebase on specific commits, one at the time, all the way up to the current HEAD of master, instead of rebasing on the whole change-set at once. It's going to take longer (and involving many temporary branches too), but it should help to find the knots which are at the core of the conflicts, and solve them at once — once that's out of the way, the rest of the rebase should be a smooth operation.

I think that the main problem here is due to a couple of AsciiDoc sources of the manual, the rest of the files needing only to be rebased on "ours" (which, apparently, according to Git it means the other branch, i.e. the rebase from, in the rebase context, where "ours" and "theirs" are swapped compared to merge operations — such a human friendly approach! linguistics are definitely not the strength of Git, which is permeated by dubious semantic choices all over its interface and documentation; sorry for the rant).

What worries me most are the HTML and PDF builds of the various commits — if we amend the ADoc sources of the Manual (i.e. integrating changes from both "their" and "ours") they won't be mirroring the actual document status any longer. But rebuilding them on a per-commit-basis is going to require much fiddling in detached head commits (not to mention potential issues with updated dependencies in the toolchain).

I'll do my best, but it's hard to be 100% sure that no edits are lost in the process — especially since the HTML previews can no longer be relied on as a way to compare results.

@thoni56
Copy link
Contributor

thoni56 commented Sep 18, 2020

Yes, I have seen that too. I'll have a look at the changes again.

Perhaps one apporach would be to just look for changes in the PDF:s and the hand-edit the changes into master.

(I'm starting to feel that since there is a lot of (valuable) cruft around the "text", having documentation in text/mergeable form is only half-valuable... One would wish for a separation of "text" and "administrative data", like indexing and stuff. But perhaps this is mainly because or documentation is in this infant state, when, through your tremendous work, that "administrative data" is changing a lot.)

A PDF diff between master and handle-ellisions can be found here, (The draftable on-line comparison is free...)

And there are actually very few essential changes in the text. Most are, as alluded to above, index, crossreferences, page headings, ... a lot of things that are "generated" or "administrative". I think it would be easier to just re-do the "real" changes by hand.

But having said that I'm not sure in which order we should do them. Probably it would be better (=easier) to merge the beta7-prep-* into beta-7-prep and then merge that. And finally re-do the ellision edits.

@tajmone
Copy link
Collaborator Author

tajmone commented Sep 18, 2020

Should be Fixed Now

@thoni56, I think I've unraveled all the knots. I took a better approach:

  1. Created a new _new_beta7-prep branch from master (now updated with the new EditorConfig settings and polished sources).
  2. Then, in _new_beta7-prep I cherry picked the various commits from beta7-prep, solving conflicts one-commit-at-the-time, and then amending any files that didn't pass the EditorConfig validation.

As it turned out, many commits would turn out as empty after solving conflicts, because they were already handled in master somehow, so the number of actual commits dropped down drastically.

This approach was much cleaner, and it gave me the chance to focus on every commit's significant changes, leaving out noise from the conflicts. I'm confident that nothing went lost.

So, now we basically have a clean version of beta7-prep, which I'll only need to rename from _new_beta7-prep to beta7-prep and force push it to origin to overtake the current conflict one (I've also kept a local backup of the current beta7-prep branch, in case we need to roll back).

Now I only to rebase the branches of the pending PRs on the new beta7-prep, and maybe reduce their commits numbers by squashing them, and amend source files according to the new code styles conventions. But this part should be easier.

Therefore ... I'll be soon force-pushing the dev branches and replacing them with their new versions, and push also their old backups, so we can keep them until everything is merged and fine.

What Went Wrong?

I now realize that my mistake was that I kept thinking that we'd be soon squashing beta7-prep into master and didn't think instead that we should have been able to rebase it on master from time to time.

On the contrary, I often added global features in beta7-prep, and then checked them out in master to share them with the main repo; and also merged master into beta7-prep from time to time, in order to access some updated feature from there.

Lessons Learned

From now on, I'll make sure that the base dev branch will always be rebasable on master, and only commit documentation contents changes on the dev branch, and repository configuration/feature changes always on master, and just rebase the dev branch as needed.

@thoni56
Copy link
Contributor

thoni56 commented Sep 18, 2020

That sounds really good!

I'm not sure I'll ever learn to handle merges, rebases and their conflicts in a fluent manner ;-)

Your lessons learned ring true with the feeling I had, but you put it nicely with "documentation contents changes". When I did the PDF diff I was thinking that it might be handy to have a tool that extract only the text from the Asciidoc, or PDF for that matter, and we can diff that. The will show the "documentation contents changes" that we want to focus on when looking at dev branches.

Looking forward to some light in this tunnel! Good work!

@tajmone
Copy link
Collaborator Author

tajmone commented Sep 18, 2020

it might be handy to have a tool that extract only the text from the Asciidoc, or PDF for that matter, and we can diff that.

I've found a similar tool for pandoc markdown (written in Haskell) which parses both docs, builds an AST and then diffs the nodes; if I remember correctly it also allows some options on what to show and hide (e.g. styles, heading levels, etc.). Unfortunately I haven't found a similar tools for AsciiDoc, and I believe that it would require an AST parser to do that.

@tajmone tajmone closed this as completed Sep 19, 2020
Alan Manual Beta7 Release automation moved this from To do to Done Sep 19, 2020
@tajmone tajmone unpinned this issue Sep 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📖 Alan Manual Issues relating to "The Alan Language Manual" ⚠️ important Priority: High
Projects
No open projects
Development

No branches or pull requests

2 participants