use a different separator than page break (^L) #167

Closed
whit537 opened this Issue · 76 comments
@whit537
Owner

Pretty much since the beginning people have been saying that ASCII page breaks are too weird. We relaxed the constraint to caret-ell as a result. @dcrosta rewrote the framework more or less to get a different separator. Whatever it is in Ruby that does something similar uses something different. The latest thready discussion is leaning the same way:

http://www.reddit.com/r/Python/comments/1b3t09/why_gittip_use_aspen_instead_of_djangorails/

So let's maybe do that. How about the separator is \n----+(.*)\n, where the group is used as the specline (specifying which syntax the following page is).

@meatballhat

Why not make the page break configurable? No matter what you choose, you'll have to think about how to let people escape it, so why choose at all? :smile_cat:

@whit537
Owner

@meatballhat Escaping it isn't the driving issue, it's "weirdness." We can escape with \, I think.

---- would start a new page.
\---- would render as ----.
\\---- would render as \----.
\\\---- would render as \\----.
\\\\---- would render as \\\----.
etc.

No?
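A minimal sketch of the escaping rule listed above, assuming it applies only at the start of a line. The names `ESCAPED`, `unescape` and `is_page_break` are hypothetical and not part of Aspen.

```python
import re

# Hypothetical pattern: at the start of a line, one backslash followed by
# any further backslashes and then four or more dashes.
ESCAPED = re.compile(r'^\\(\\*-{4,})', re.MULTILINE)


def unescape(text):
    """Drop exactly one leading backslash from each escaped separator line."""
    return ESCAPED.sub(r'\1', text)


def is_page_break(line):
    """A line starts a new page only if it begins with unescaped dashes."""
    return line.startswith('----')
```

With this sketch a line of one backslash plus `----` comes out as `----`, and each extra leading backslash survives unchanged, matching the table above.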

@whit537
Owner

And I think this is a place for convention over configuration. I think ---- will be universally accessible enough that once we go that route we won't get demand for additional alternatives.

@meatballhat

But what kind of favicon is ---- ???

in seriousness, :thumbsup:

Would this change make it any easier for folks to write syntax highlighting rules for various editors? Given, I really only care about vim, but figured I'd bring it up.

@whit537
Owner

I think the explicit file extension is more important for tool support than the page delimiter. Right now the algorithm for deciding if a file is a simplate is rather complicated. It's too much to ask tool authors to reproduce that algorithm in N tools. Much easier to look for .spt (and .sock?).

@meatballhat

Being a multi-glot framework, is a single file extension enough? :wink:

.spt.{py,js,rb,go} is probably getting too silly, though...

@whit537
Owner

For the foreseeable future we're going to have a different process for each language. If/when we ever have one process handling multiple languages, we can use hashbangs to specify the language per file.

@meatballhat

I was thinking about the editor standpoint. How does one write syntax rules to cover *.spt files with 4 (for now) potential languages in the first 1+ pages? This can most certainly get punted for this issue -- just thinking about the future :smile_cat:

@whit537
Owner

Yeah, we'd have to parse hashbangs. That sounds a lot easier to me than sniffing for page breaks in potentially binary files.

@sigmavirus24

I am so -1 on this I'm probably -10. The purpose of using a framework is to have it so that someone can move from one project in it to another without having to concern themselves with the trivialities of the other project author's mental hobgoblins. Having all aspen projects use ^L or perhaps another page separator is fine with me but please don't make it configurable. Someone sufficiently insane could use a custom page separator FOR EACH FILE if you declare it in the hashbang. That's just insanity, and you might say "Oh but Ian, no one would ever do that", but you'd be wrong. :-)

Please please please don't do this.

@meatballhat

@sigmavirus24 The shebangs are all about syntax highlighting, not configuration of page breaks, IIUC. Configurability of the page separator for an entire site (everything under the master process or whatever) is desirable IMO, but I can certainly see your point. I tend to prefer frameworks that let me break myself rather than trying to anticipate all of my problems. If the (globally!) configurable page break were left undocumented or the documentation made me feel bad about changing it, that'd be fine by me.

@pjz
Owner

I don't care what the separator pattern is, but there should be one and only one (to facilitate toolage), and it should start and end with a newline (so we can put declarative data there). It being configurable-but-advisedly-not is, I guess, fine too.

@sigmavirus24

@meatballhat Sorry, I read that notification email before having sufficient coffee funds deposited. But yeah, I think aspen is young enough that it could be properly changed, I just feel dirty letting someone configure that. I can see, however, how ^L might be difficult for international users who may not have an L character though. Perhaps making it a "hidden" (but documented) feature would be a good way of doing it.

@whit537
Owner

Okay, decided:

  • New separator is \n----+(.*)\n
  • Escape character is \.
  • We'll implement in such a way that it can be changed if desired, but we'll leave this undocumented for now. What this will look like is overriding something like an aspen.resource.SPLITTER regex or an aspen.resource.split_pages function depending on how we implement.
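As a rough sketch of what the decision could look like in code, assuming a module-level regex and a helper function. `SPLITTER` and `split_pages` echo the names floated above, but this standalone version is an illustration, not Aspen's implementation, and it leaves escaping out.

```python
import re

# The separator as decided above: newline, four or more dashes,
# the rest of the line as the specline, newline.
SPLITTER = re.compile(r'\n----+(.*)\n')


def split_pages(raw):
    """Return a list of (specline, content) pairs.

    The first page has no separator in front of it, so its specline is ''.
    """
    # With one capture group, re.split returns
    # [content0, spec1, content1, spec2, content2, ...].
    parts = SPLITTER.split(raw)
    pages = [('', parts[0])]
    for i in range(1, len(parts), 2):
        pages.append((parts[i].strip(), parts[i + 1]))
    return pages
```

Because each match consumes both surrounding newlines, two separators on consecutive lines would not both match in this sketch; a real implementation would have to deal with that case.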

Now, who wants to implement? :)

@pjz
Owner

What's the escape character for?

@whit537
Owner

The case where you want to send a literal \n----\n to the client. No?

@Lucretiel

I'll go for it. What version of python are we supporting?

@pjz
Owner

so \n----\n gets ignored as a simplate divider and instead is a literal \n----\n in whatever section it's in? Okay, sounds reasonable.

@whit537
Owner

@Lucretiel Whoa, awesome! Thanks! :D

Right now we are Python 2.5 - 2.7. Do you IRC? We're in #aspen on Freenode.

@lyndsysimon

:-1:

I preferred the \n(.*)^L(.*)\n proposal myself.

Preferences aside, this sounds like a breaking change. Are you going to have a transitionary period when either syntax is allowed, or are you going to require that everyone either change their code or override the new syntax?

If it's the latter, how about a simple config option like OLD_STYLE_PAGEBREAKS in configure-aspen.py to ease things a bit?

@whit537
Owner

@lyndsysimon Since we're still pre-1.0, I think we should not encumber ourselves with backwards compatibility here. We'll bump from 0.22.x to 0.23.x, and we'll have to update our simplates for projects already using Aspen.

@whit537
Owner
@clone1018

Guess this is useless now that we're already making the change, but I'm giving this my -1.

But hey, now that we're changing this, let's also remove the entire concept of file based routing, one big routes file would be good!

RIP ^L

@whit537
Owner

/me observes moment of silence :disappointed:

@whit537
Owner

Also, I think if there is only one page (zero page breaks) then we shouldn't do escaping.

When I post a hax0r.txt with:

\\\\\\\\\\\\--------------------------------

kr3w!!!!!!!


\\\\\\\\\\\\--------------------------------

.gov hax0r3d!!!!!!!!

And it's a plain txt file, not a simplate, then we should just pass it through unaltered.

@joelmccracken

Not that it matters, really, but a bunch of emacs lispers put pagebreaks in their Elisp files. It's a common thing in the emacs community.

@pjz
Owner

So what's the backward compatible setting for aspen.resource.SPLITTER ? r'\n^L(.*)\n' ?

@Lucretiel

As I'm writing it now, just '^L' should work. My code automatically suffixes and prefixes as necessary. If you want to have it anywhere in the line, use '.*?^L'.

@Lucretiel Lucretiel referenced this issue from a commit
Nathan West Changed page delimiter to ----+
See issue #167
3fd91f2
@ianb

Kind of an aside, but many editors search for special comments to determine mode. In Emacs for instance -*- mode: aspen-python -*- or somesuch. It's kind of a terrible separator, but to the degree the separator is unique it might be possible to search a file for that separator on opening and change the editor mode. ---- isn't unique enough for that though.

@whit537
Owner

@joelmccracken Yeah, I picked it up from an Emacs hacker in the first place (@warsaw). You can still use it in simplates if you want to, right? It'll be ignored as whitespace by most(?) syntaxes.

@ianb Yeah, for Aspen we'll end up with a "specline" after the ---- where you can set renderer and content type.

import foo
---- #!pystache text/plain
I'm pretty sure this is how {{ foo }} works.
@whit537
Owner

Also, I mentioned it in IRC but let's record it here too. The ---- can be variable length, because depending on the size of the simplate it's nice to have that longer or shorter. A small simplate like in the previous comment just needs ----, but for a complex simplate (e.g., stats in Gittip), it'll be nice to have a more significant break.

@whit537
Owner

@ianb Though you could have -*- mode: aspen-python -*- at the top of your file to trigger a simplate multimode, right?

@ianb

@whit537: sure, but I hate such editor-specific slugs, so if the file was unambiguous anyway it might not be necessary.

@whit537
Owner

@ianb Yeah, switching to an explicit .spt file extension (#148) should address that. Wanna pick that one up? :]

@pjz
Owner

Variable-length --'s are more likely to match elsewhere in the file. Any suggestions that are less likely to actually be found in, e.g., Markdown? (which is something someone could totally use as a templating language)

/-\5P3/\/ ? /\/\/\/\/\ ? _-=-_-=-_-=-_ ? \n<<-----(.*)----->>\n ?

@whit537
Owner

@pjz But won't escaping still catch them?

@whit537
Owner

I wanna just mash dash as much as I need. :)

@pjz
Owner

sure, but that makes the developer escape every single case of multiple ---'s at the beginning of a line, which is all over the place in Markdown.

@pjz
Owner

I'm really liking ^L more and more :)

@ianb

I never write ^^^^ for other reasons. But ^_^ is perhaps better.

@clone1018

+1 for ^L

@ArmstrongJ

Very sad to see ^L go, partially for nostalgia, but mostly because I had thought it a) worked fine and b) was more noticeable than four dashes. Additionally, I'm not looking forward to going back through my simplates to change them...

+1 for ^L

@Lucretiel

Guys guys guys don't forget that ---- is only treated as a page separator at the BEGINNING of a line. Anywhere else and it's ignored by the parser. Also keep in mind that it must have a minimal length of 4 dashes.
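To make the two constraints concrete (start of line, at least four dashes), here is a small sketch. `SEPARATOR_LINE` and `find_breaks` are hypothetical names, and the pattern is an assumption about the rule as described, not the parser in the branch.

```python
import re

# Hypothetical pattern: four or more dashes at the very start of a line,
# with anything after them treated as the specline.
SEPARATOR_LINE = re.compile(r'^-{4,}(.*)$')


def find_breaks(text):
    """Return the 1-based line numbers that would be read as page breaks."""
    return [
        number
        for number, line in enumerate(text.splitlines(), 1)
        if SEPARATOR_LINE.match(line)
    ]
```

Dashes in the middle of a line and a three-dash line are ignored, but a Markdown heading underline of four or more dashes still counts as a break, which is the conflict @pjz raised.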

@Lucretiel

I'd like to put in, again, for "grab first line" as the delimiter, similar to regex / anchors or cat <<EOF.

@whit537
Owner

I'm seeing six issues:

  1. @ArmstrongJ and @clone1018 still like ^L. (@pjz is just trolling us for bike-shedding. ;) )
  2. @Lucretiel is advocating for specifying the delimiter in the first line of the file.
  3. @Lucretiel is advocating in IRC for multiple files instead of multiple pages (this came up on the Reddit thread as well)
  4. @pjz raises a valid point about ---- conflicting with text formats such as Markdown and reST.
  5. We need to account for the specline.
  6. We need to make tooling easy.

^L

Both times that Aspen has gotten any conversation going around it, the page break has been a big stumbling-block. Check out this HN thread from two years ago, and then compare it with yesterday's Reddit thread. People are really hung up on ^L. @dcrosta hated it so much that he wrote Keystone.

It's time to bite the bullet and move on.

Furthermore, I think Aspen is still early enough in its growth curve (despite its age) that we don't want to encumber the codebase with full-blown backwards-compatibility. We're still pre-1.0. If we can come up with a solution that allows for undocumented hacking of the delimiter in a way that lets @ArmstrongJ and others continue using Aspen with their old-skool simplates, I'm all for it. I'm thinking of a code snippet you can drop in configure-aspen.py.

First Line as Delimiter

I'm -1 on this, because I don't want to make the delimiter officially configurable at all. As mentioned before, I see the page delimiter as a clear place for convention over configuration. Had I chosen a better delimiter in the first place we would never be debating this now. We need "one right way" to do this, so that we have a stable foundation for the tooling ecosystem we need to build next. ;)

Multiple Files

This is too big a change to be in scope here. We can discuss that on a new ticket if desired but can't let that bog us down here.

Conflicts with Markdown, etc.

This is a valid point. The two places in Markdown where ---- is used are h2s and hrs. Both also support alternatives (## Foo for h2, and **** or - - for hr). The case is similar in reStructuredText.

One possibility is to proceed as planned (----+), and require people to escape any ---- in Markdown or use the alternatives.

Another possibility is to modify the delimiter slightly so as not to conflict. We would still provide escaping.

Specline

Content pages in Aspen can have two parameters set in the so-called "specline," which comes after the delimiter and before the newline. Right now it looks like this:

import foo
^L #!pystache text/plain
{{ foo }}

That tells Aspen to use the pystache "renderer," and to serve the result as Content-Type: text/plain.

We don't need arbitrary key:value headers in the specline; those are the only two parameters we need to account for. We do need to account for them, though.

Tooling

The solution we come up with needs to be easy to build tooling on top of. Primarily this means text editor plugins for .spt files that will use different modes for different parts of the page. I don't know how well multimodes are supported in various editors. My pessimistic expectation is that it's kind of kludgy and we'll have to fix editors to really make Aspen plugins work perfectly.

Proposal

Okay, so here's what I propose:

import foo
[-----------------------------]
bar = foo.bar()
[-----------------------------] text/plain via pystache
{{ bar }}

That's [----] for the basic delimiter.

  • It can have four or more dashes.
  • It uses brackets to avoid conflicting with Markdown, etc.
  • It uses brackets because they're near - and don't require a modifier key to type.
  • It uses brackets on both ends for arbitrary aesthetic reasons.
  • The specline is outside the [----] so it's easier to parse.
  • The specline drops the hashbang because hashbangs mean something else.
  • The specline is content_type via renderer.
  • Both content_type and via renderer are optional in the specline.
  • If content_type is omitted then it's [----] via renderer.
  • This can still be escaped as \[----]
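The bullets above can be sketched as a small line parser. This assumes the specline is whitespace-separated with a literal `via` keyword; `BRACKET_SEP` and `parse_separator` are hypothetical names used only for illustration.

```python
import re

# Hypothetical pattern for the proposed delimiter: '[' + four or more
# dashes + ']' at the start of a line, followed by an optional specline.
BRACKET_SEP = re.compile(r'^\[-{4,}\](.*)$')


def parse_separator(line):
    """Return (content_type, renderer) for a separator line, else None.

    Either value is '' when omitted from the specline.
    """
    match = BRACKET_SEP.match(line)
    if match is None:
        return None  # ordinary line, or an escaped separator like \[----]
    tokens = match.group(1).split()
    if 'via' in tokens:
        i = tokens.index('via')
        return ' '.join(tokens[:i]), ' '.join(tokens[i + 1:])
    return ' '.join(tokens), ''
```

For the example in the proposal, the second separator line would parse to `('text/plain', 'pystache')`, and a bare `[----]` to `('', '')`.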

I don't know enough about writing editor plugins to say whether this makes it easy or not. Does anyone have experience here? I may revisit aspen-vim and/or aspen-emacs to spike this out a bit.

@Lucretiel

For those curious, here's my progress on the ------- (no []) style delimiters https://github.com/Lucretiel/aspen-python/tree/issue167

@meatballhat

:thumbsup: to the closing bulleted list in @whit537's latest comment. I also don't know enough about writing editor plugins, but my one experience creating some syntax highlighting for a proprietary template language in vim tells me it's probably not completely bonkers.

@Lucretiel
@warsaw
@whit537
Owner

Thanks for chiming in, @warsaw. Love the history! :D

@Lucretiel Re: question in IRC ... padding is so we get accurate line numbers in tracebacks. We haven't had tests for that but we should.
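One way to picture the padding idea: prefix a page with as many newlines as there are lines before it in the file, so the compiled code reports file-relative line numbers. This is a sketch under that assumption; `pad` and `failing_line` are hypothetical helpers, not Aspen functions.

```python
import sys
import traceback


def pad(page, first_line):
    """Prefix a page with blank lines so it starts at its line in the file."""
    return '\n' * (first_line - 1) + page


def failing_line(page, first_line, filename='example.spt'):
    """Run a page and return the line number reported in the traceback."""
    code = compile(pad(page, first_line), filename, 'exec')
    try:
        exec(code, {})
    except Exception:
        tb = sys.exc_info()[2]
        # The last traceback entry belongs to the page's own code;
        # index 1 of the entry is its line number.
        return traceback.extract_tb(tb)[-1][1]
    return None
```

Without the padding, an error on the first line of a later page would be reported as line 1 rather than its real position in the simplate.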

@Lucretiel Lucretiel referenced this issue from a commit
@Lucretiel Lucretiel Initial implementation of issue #167
-Untested
-Other source files have not been updated
151de99
@whit537
Owner

@pjz You up to review and merge @Lucretiel's work?

@whit537
Owner

@Lucretiel I gave your branch a once-over and it looks good on the surface. Is this ready for a full review? What are you seeing as next steps?

@Lucretiel

Yeah I'd say it's good to go at this point. The only thing still missing is finishing the update of some of the negotiated_template tests.

@pjz
Owner

Looks good, so I merged it; re-open this if there are issues with it.

@pjz pjz closed this
@whit537
Owner

Damn. Good honkin' work, everyone! Big thanks to @Lucretiel for implementing this and to @pjz for reviewing/merging.

The tests are passing for me and I'm seeing [-----------------] in the simplates in docs/. Did some poking at speclines and that seems to be working as well.

We need to update the docs, themselves, though. We're still talking about ^L. There's a make doc target to run the docs locally from a clone.

@whit537 whit537 reopened this
@whit537
Owner

Escaping doesn't work.

@Lucretiel
@whit537 whit537 referenced this issue from a commit
@whit537 whit537 Prune "Page Break" file from docs; #167
Let's fold this into the main page re: Simplates.
239f13f
@whit537
Owner

@Lucretiel Am I reading that wrong?

@whit537 whit537 referenced this issue from a commit
@whit537 whit537 Update ^Ls in docs; #167 5bf360a
@whit537
Owner

I noticed this while updating the docs.

@whit537
Owner

@Lucretiel Is this something you think you'll be able to take a look at? I leave for NYC early tomorrow. Would love to have this released and implemented on Gittip before tomorrow night's talk at NYCPython.

@Lucretiel
@Lucretiel

Bug found. My code, and all my tests, use / as the escape character, not \. Fixing.

@pjz
Owner

So the issue here is that @Lucretiel wrote escaping with forward slashes and @whit537 tried to escape using backslashes. The docs in this issue say to implement using backslashes. My bad for not catching that in code review. @Lucretiel think you can fix it and shoot me a pull request?

@whit537 whit537 referenced this issue from a commit
@whit537 whit537 Fix regression in resources; #167
The order of imports was important in resources/__init__.py. I addressed
this by moving the pagination methods to a separate module.
5ca9899
@whit537
Owner

Phew. :)

@whit537
Owner

@Lucretiel @pjz I'll stop with the changes to master for now while we fix this.

@whit537
Owner

I'm going to start looking at #148 but I'll lay off master.

@whit537
Owner

@Lucretiel Also, may I add you as a collaborator on the Gittip org with perms on this repo (aspen-python)? You do good work. :)

@Lucretiel
@whit537
Owner

@Lucretiel Done. Welcome aboard. :)

P.S. You should consider joining Gittip when you get a chance:

https://www.gittip.com/on/github/Lucretiel/

@whit537 whit537 referenced this issue from a commit
@whit537 whit537 Add forgetten file; #167 8ff8f72
@pjz
Owner

bugs fixed as of ee371e4. Still need docco update.

@pjz
Owner

Actually looks like @whit537 fixed the docs yesterday. Closing this unless/until other bugs found.

@pjz pjz closed this
@Lucretiel

Changed to be a raw string.
Regarding @whit537's regression commit: I originally had that stuff in a different module, but I didn't want to disrupt the model of all the modules in resources being resource types.

@Lucretiel

Reverted

@pjz
Owner

Ugh, I owe you an apology. Turns out that while r'' keeps the Python interpreter from turning \n into a newline, the regex interpreter will then do it. So it was working after all. Either way works; escape hell is still escape hell :)
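A quick check of that point, using the early `\n----+(.*)\n` pattern from this thread as the example (the merged code may use a different pattern). `split_both` is a hypothetical helper.

```python
import re

# Same pattern written two ways. In the plain string Python turns \n into a
# newline character; in the raw string the backslash-n reaches the regex
# engine, which then interprets it as a newline itself.
PLAIN = '\n----+(.*)\n'
RAW = r'\n----+(.*)\n'


def split_both(text):
    """Split text with both spellings of the pattern."""
    return re.split(PLAIN, text), re.split(RAW, text)
```

The two strings differ in length (11 versus 13 characters), yet both calls split `'a\n---- spec\nb'` into `['a', ' spec', 'b']`.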

@whit537 whit537 referenced this issue from a commit
@whit537 whit537 Fix some seps that shouldn't be escaped; #167
For some examples we have <pre>[----] and we don't need to escape the
page separator in that case because it's not at the beginning of the
line.
4861224
@whit537 whit537 referenced this issue from a commit
@whit537 whit537 Relax [----] to [---]; #167
It's easier to type one three instead of two twos.
a340aae
@whit537
Owner

@pjz sez we're not allowed to change this again for at least 6 months. :)
