Skip to content

Commit

Permalink
serialize srclocs in bytecode; change expander to keep srclocs
Browse files Browse the repository at this point in the history
To avoid recording absolute paths from a build environment in bytecode
files, the bytecode writer converts paths to relative form based on
`current-write-relative-directory`. For paths that cannot be made
relative in that way and that are in source locations in syntax
objects, the printer in v6.x converted those paths to strings that
drop most of the path.

The v7 expander serializes syntax objects as part of `compile` instead
of `write`, so it can't truncate paths in the traditional way. To help
out the expander, the core `write` function for compiled code now
allows `srcloc` values --- as long as the source field is a path,
string, byte string, symbol, or #f. (Constraining the source field
avoids various problems, including problems that could be created by
cyclic values.) As the core `write` for compiled code prints a path,
it truncates a source path in the traditional way.

The expander doesn't constrain source locations in syntax objects to
have path, string, etc., source values. It can serialize syntax
objects with non-path source values at `compile` time, so there's no
loss of functionality.

The end result is to fix abolute paths that were getting stored in the
bytecode for compiled packages, since that's no good for installing
packages in built form (which happens, for example, during a
distribution build).
  • Loading branch information
mflatt committed Jun 27, 2018
1 parent cda4e5b commit b13f723
Show file tree
Hide file tree
Showing 20 changed files with 373 additions and 78 deletions.
2 changes: 1 addition & 1 deletion pkgs/base/info.rkt
Expand Up @@ -12,7 +12,7 @@

(define collection 'multi)

(define version "7.0.0.5")
(define version "7.0.0.6")

(define deps `("racket-lib"
["racket" #:version ,version]))
Expand Down
5 changes: 4 additions & 1 deletion pkgs/racket-doc/scribblings/reference/exns.scrbl
Expand Up @@ -988,7 +988,10 @@ The fields of a @racket[srcloc] instance are as follows:
@item{@racket[span] --- The number of covered positions (counts from
0) or @racket[#f] (unknown).}

]}
]

See @secref["print-compiled"] for information about the treatment of
@racket[srcloc] values that are embedded in compiled code.}


@defproc[(srcloc->string [srcloc srcloc?]) (or/c string? #f)]{
Expand Down
9 changes: 6 additions & 3 deletions pkgs/racket-doc/scribblings/reference/fasl.scrbl
Expand Up @@ -26,8 +26,9 @@ output port or returning the byte string otherwise. The
@racket[s-exp->fasl].

The @racket[v] argument must be a value that could be @racket[quote]d
as a literal---that is, something for which @racket[(compile `(quote
,v))] would work and be @racket[read]able after @racket[write]. The
as a literal---that is, a value without syntax objects for which
@racket[(compile `(quote ,v))]
would work and be @racket[read]able after @racket[write]. The
byte string produced by @racket[s-exp->fasl] does not use the same
format as compiled code, however.

Expand All @@ -38,7 +39,9 @@ preserve graph structure, handle cyclic data, and encode serializable
structures. The @racket[s-exp->fasl] and @racket[fasl->s-exp]
functions consult @racket[current-write-relative-directory] and
@racket[current-load-relative-directory], respectively, in the same
way as bytecode saving and loading to store paths in relative form.
way as bytecode saving and loading to store paths in relative form,
and they similarly allow and convert constrained @racket[srcloc]
values (see @secref["print-compiled"]).

Unless @racket[keep-mutable?] is provided as true to
@racket[s-exp->fasl], then mutable values in @racket[v] are replaced
Expand Down
24 changes: 17 additions & 7 deletions pkgs/racket-doc/scribblings/reference/printer.scrbl
Expand Up @@ -555,11 +555,6 @@ assembly code for Racket, and reading such a form produces a compiled
form when the @racket[read-accept-compiled] parameter is set to
@racket[#t].

When a compiled form contains syntax object constants, they must not
be @tech{tainted} or @tech{armed}; the @litchar{#~}-marshaled form
drops source-location information and properties (see
@secref["stxprops"]) for the @tech{syntax objects}.

Compiled code parsed from @litchar{#~} is marked as non-runnable if
the current code inspector (see @racket[current-code-inspector]) is
not the original code inspector; on attempting to evaluate or reoptimize
Expand Down Expand Up @@ -593,7 +588,7 @@ identifier; those functions lead to top-level and module variables
with @tech{unreadable symbol}ic names, and the names are deterministic
as long as expansion is otherwise deterministic.

Finally, a compiled form may contain path literals. Although paths are
A compiled form may contain path literals. Although paths are
not normally printed in a way that can be read back in, path literals
can be written and read as part of compiled code. The
@racket[current-write-relative-directory] parameter is used to convert
Expand All @@ -610,6 +605,19 @@ coerced to a string that preserves only part of the path (an in effort
to make it less tied to the build-time filesystem, which can be
different than the run-time filesystem).

Finally, a compiled form may contain @racket[srcloc] structures if the
source field of the structure is a path for some system, a string, a
byte string, a symbol, or @racket[#f]. For a path value (matching the
current platform's convention), if the path cannot be recorded as a
relative path based on @racket[current-write-relative-directory], then
it is converted to a string with at most two path elements; if the
path contains more than two elements, then the string contains
@litchar{.../}, the next-to-last element, @litchar{/} and the last
element. The intent of the constraints on @racket[srcloc] values and
the conversion of the source field is to preserve some source
information but not expose or record a path that makes no sense on
a different filesystem or platform.

For internal testing purposes, when the
@as-index{@envvar{PLT_VALIDATE_LOAD}} environment variable is set, the
reader runs a validator on bytecode parsed from @litchar{#~}. The
Expand All @@ -623,4 +631,6 @@ procedure is called.
mark the loaded code as generally
unrunnable instead of rejecting at
read time references to unsafe
operations.}]
operations.}
#:changed "7.0" @elem{Allowed some @racket[srcloc] values
embedded in compiled code.}]
26 changes: 19 additions & 7 deletions pkgs/racket-test-core/tests/racket/fasl.rktl
Expand Up @@ -27,11 +27,12 @@
43/100
44+100i
45.0+100.0i
46f0))
46f0
(srcloc "x" 1 2 3 4)))

;; The fasl format is meant to be forward-compatible:
(define immutables-regression-bstr
#"racket/fasl:\0\200\371\0\34\"n\4\3\6\ao\r2\16\5three\23\4four\25\4five\21\3six\"u \3vwx\36yz\35\2{|\16\afifteen%\1\2\200\16\bnineteen\16\asixteen\177%\0\2\202\23\ntwenty-one\204\23\ftwenty-three%\2\2\206\207\210\211#\16\ftwenty-eight\3\213\214\23\00231\b\340\b\200\344\f\b\201\320\204\0\0\b\2010W\5\0\b\201\200\3566\0\b\201\200\300\r\26\b\202\0\374\371\330\b\0\0\0\b\202\0`v\363\263b\1\0\b\202\0\0\220\235\316\332\2027\t\0\0\0\0\0\200D@\t\315\314\314\314\314\314\20@\v\231\322\f\232\322\f\t\0\0\0\0\0\200F@\t\0\0\0\0\0\0Y@\n\0\08B")
#"racket/fasl:\0\200\n\1\34#n\4\3\6\ao\r2\16\5three\23\4four\25\4five\21\3six\"u \3vwx\36yz\35\2{|\16\afifteen%\1\2\200\16\bnineteen\16\asixteen\177%\0\2\202\23\ntwenty-one\204\23\ftwenty-three%\2\2\206\207\210\211#\16\ftwenty-eight\3\213\214\23\00231\b\340\b\200\344\f\b\201\320\204\0\0\b\2010W\5\0\b\201\200\3566\0\b\201\200\300\r\26\b\202\0\374\371\330\b\0\0\0\b\202\0`v\363\263b\1\0\b\202\0\0\220\235\316\332\2027\t\0\0\0\0\0\200D@\t\315\314\314\314\314\314\20@\v\231\322\f\232\322\f\t\0\0\0\0\0\200F@\t\0\0\0\0\0\0Y@\n\0\08B\34\6\16\6srcloc\23\1xopqr")

(for ([i (in-list immutables)])
(test i fasl->s-exp (s-exp->fasl i)))
Expand Down Expand Up @@ -89,16 +90,27 @@
(let ([unix-path (bytes->path #"here" 'unix)]
[windows-path (bytes->path #"there" 'windows)])
(test unix-path fasl->s-exp (s-exp->fasl unix-path))
(test windows-path fasl->s-exp (s-exp->fasl windows-path))))
(test windows-path fasl->s-exp (s-exp->fasl windows-path))
(if (eq? (system-path-convention-type) 'unix)
(test (srcloc "here" 1 2 3 4) fasl->s-exp (s-exp->fasl (srcloc unix-path 1 2 3 4)))
(test (srcloc "there" 1 2 3 4) fasl->s-exp (s-exp->fasl (srcloc windows-path 1 2 3 4))))
(let ([root (car (filesystem-root-list))])
(test (srcloc (path->string root) 1 2 3 4) fasl->s-exp (s-exp->fasl (srcloc root 1 2 3 4)))
(test (srcloc (path->string (build-path root "x")) 1 2 3 4) fasl->s-exp (s-exp->fasl (srcloc (build-path root "x") 1 2 3 4))))
(test (srcloc ".../a/b" 1 2 3 4) fasl->s-exp (s-exp->fasl (srcloc (build-path (current-directory) "a" "b") 1 2 3 4)))))

(let* ([rel-p (build-path "nested" "data.rktd")]
[p (build-path (current-directory) rel-p)])
(define bstr
(define-values (bstr srcloc-bstr)
(parameterize ([current-write-relative-directory (current-directory)])
(s-exp->fasl p)))
(values
(s-exp->fasl p)
(s-exp->fasl (srcloc p 10 20 30 40)))))
(parameterize ([current-load-relative-directory #f])
(test rel-p fasl->s-exp bstr))
(test rel-p fasl->s-exp bstr)
(test (srcloc rel-p 10 20 30 40) fasl->s-exp srcloc-bstr))
(parameterize ([current-load-relative-directory (current-directory)])
(test p fasl->s-exp bstr)))
(test p fasl->s-exp bstr)
(test (srcloc p 10 20 30 40) fasl->s-exp srcloc-bstr)))

(report-errs)
29 changes: 29 additions & 0 deletions pkgs/racket-test-core/tests/racket/print.rktl
Expand Up @@ -339,6 +339,35 @@
(write (s) (p o))
(test "ok" get-output-string o))

;; ----------------------------------------
;; Check that some values are allowed in a srcloc source
;; in printed compiled code, and some values are not

(let ()
(define (try v [result-v v] #:ok? [ok? #t])
(define-values (i o) (make-pipe))
(define c (compile `,(srcloc v 1 2 3 4)))
(cond
[ok?
(write c o)
(test result-v
srcloc-source
(parameterize ([current-load-relative-directory (build-path (current-directory) "sub")])
(eval (parameterize ([read-accept-compiled #t])
(read i)))))]
[else
(err/rt-test (write c o) (lambda (exn) (and (exn:fail? exn)
(regexp-match? #rx"cannot marshal" (exn-message exn)))))]))

(try #f)
(try 'apple)
(try "apple")
(try #"apple")
(try (string->path "apple") "apple")

(try 7 #:ok? #f)
(try (box 7) #:ok? #f))

;; ----------------------------------------

(report-errs)
19 changes: 18 additions & 1 deletion pkgs/racket-test-core/tests/racket/stx.rktl
Expand Up @@ -2600,6 +2600,23 @@
#rx"key for a perserved property must be an interned symbol"
(exn-message exn))))

;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Make sure that paths from the current installation are not
;; preserved in marshaled bytecode

(let ([m '(module m racket/base
;; Extending a primitive structure type tends to
;; capture an identifier whose source is "kernstruct.rkt"
(define-struct (cookie-error exn:fail) ()))])
(define o (open-output-bytes))
(write (compile m) o)
(call-with-output-file "/tmp/d" #:exists 'replace (lambda (o) (write (compile m) o)))
(test #t
not
(regexp-match? (regexp-quote
(path->bytes (collection-file-path "kernstruct.rkt" "racket/private")))
(get-output-bytes o))))

;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Make sure the srcloc encoding doesn't do something strange
;; with a path in a root directory:
Expand All @@ -2615,7 +2632,7 @@
(write (compile (read-syntax path p)) out)
(eval (read in))
(define src (syntax-source ((dynamic-require path 'f))))
(test path values src)))
(test (path->string path) values src)))

;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Expand Down
62 changes: 40 additions & 22 deletions pkgs/zo-lib/compiler/zo-marshal.rkt
Expand Up @@ -11,7 +11,8 @@
racket/pretty
racket/path
racket/set
racket/extflonum)
racket/extflonum
racket/private/truncate-path)

(provide/contract
[zo-marshal ((or/c linkl-directory? linkl-bundle?) . -> . bytes?)]
Expand Down Expand Up @@ -329,9 +330,10 @@
CPT_SET_BANG
CPT_VARREF
CPT_APPLY_VALUES
CPT_OTHER_FORM)
CPT_OTHER_FORM
CPT_SRCLOC)

(define CPT_SMALL_NUMBER_START 46)
(define CPT_SMALL_NUMBER_START 47)
(define CPT_SMALL_NUMBER_END 74)

(define CPT_SMALL_SYMBOL_START 74)
Expand Down Expand Up @@ -745,26 +747,8 @@
(out-anything qv out))]
[(? path?)
(out-byte CPT_PATH out)
(define (within? p)
(and (relative-path? p)
(let loop ([p p])
(define-values (base name dir?) (split-path p))
(and (not (eq? name 'up))
(not (eq? name 'same))
(or (not (path? base))
(loop base))))))
(define maybe-rel
(and (current-write-relative-directory)
(let ([dir (current-write-relative-directory)])
(and (or (not dir)
(within? (find-relative-path v
(if (pair? dir)
(cdr dir)
dir))))
(find-relative-path v
(if (pair? dir)
(car dir)
dir))))))
(path->relative-path v))
(cond
[(not maybe-rel)
(define bstr (path->bytes v))
Expand All @@ -777,6 +761,19 @@
(path-element->bytes e)
e))
out)])]
[(? srcloc?)
(out-byte CPT_SRCLOC out)
(define src (srcloc-source v))
(define new-src
(cond
[(and (path? src) (not (path->relative-path src)))
(truncate-path src)]
[else src]))
(out-anything new-src out)
(out-anything (srcloc-line v) out)
(out-anything (srcloc-column v) out)
(out-anything (srcloc-position v) out)
(out-anything (srcloc-span v) out)]
[(or (? regexp?)
(? byte-regexp?)
(? number?)
Expand Down Expand Up @@ -973,3 +970,24 @@
[(struct-other-shape? constantness)
(to-sym 5)]
[else #f]))

(define (path->relative-path v)
(define (within? p)
(and (relative-path? p)
(let loop ([p p])
(define-values (base name dir?) (split-path p))
(and (not (eq? name 'up))
(not (eq? name 'same))
(or (not (path? base))
(loop base))))))
(and (current-write-relative-directory)
(let ([dir (current-write-relative-directory)])
(and (or (not dir)
(within? (find-relative-path v
(if (pair? dir)
(cdr dir)
dir))))
(find-relative-path v
(if (pair? dir)
(car dir)
dir))))))
9 changes: 8 additions & 1 deletion pkgs/zo-lib/compiler/zo-parse.rkt
Expand Up @@ -294,7 +294,8 @@
[43 varref]
[44 apply-values]
[45 other-form]
[46 74 small-number]
[46 srcloc]
[47 74 small-number]
[74 92 small-symbol]
[92 ,(+ 92 small-list-max) small-proper-list]
[,(+ 92 small-list-max) 192 small-list]
Expand Down Expand Up @@ -460,6 +461,12 @@
(build-path p (if (bytes? e) (bytes->path-element e) e))))
;; Read a path:
(bytes->path (read-compact-bytes cp len))))]
[(srcloc)
(srcloc (read-compact cp)
(read-compact cp)
(read-compact cp)
(read-compact cp)
(read-compact cp))]
[(small-number)
(let ([l (- ch cpt-start)])
l)]
Expand Down

1 comment on commit b13f723

@racket-discourse-github-bot

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit has been mentioned on Racket Discourse. There might be relevant details there:

https://racket.discourse.group/t/was-there-a-bug-embedding-srclocs-in-compiled-code-in-bc-8-5-that-was-resolved-in-roughly-8-9/2647/1

Please sign in to comment.