-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternative Imports #473
Alternative Imports #473
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to change the original grammar. What you have is equivalent to the grammar even if implemented in a different way
src/Dhall/Core.hs
Outdated
@@ -473,6 +478,7 @@ instance Monad (Expr s) where | |||
Field a b >>= k = Field (a >>= k) b | |||
Project a b >>= k = Project (a >>= k) b | |||
Note a b >>= k = Note a (b >>= k) | |||
ImportAlt a _b _c >>= k = k a -- TODO: does this make sense? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The part I'm not sure about is that the contained expressions (_b
, _c
) will not bind to k
(as it happens with the other operators)
src/Dhall/Core.hs
Outdated
@@ -1889,6 +1912,7 @@ isNormalized e = case denote e of | |||
else True | |||
_ -> True | |||
Note _ e' -> isNormalized e' | |||
ImportAlt _ _ _ -> True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be:
ImportAlt _ b c -> isNormalized b && isNormalized c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, missed that, thanks 👍
src/Dhall/Parser/Expression.hs
Outdated
importAltExpression embedded = | ||
noted (do | ||
l <- orExpression embedded | ||
a <- embedded -- FIXME: broken |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just do:
let a = Missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite, since we need to keep the parsers polymorphic (as it gets concrete only in exprA = expr import_
).
I think the problem here is that the constructor is ImportAlt a (Expr s a) (Expr s a)
, and I'm not sure how to get an a
from the embedded
without running it (which we cannot, as we have to continue the parsing of the current tree), so that I can run the fold
two lines later with (ImportAlt a)
. That's why I'm concerned that the constructor might not be right.
@f-f: Another option is to remove the first field from Then what you could do is:
|
@Gabriel439: yeah this approach is much simpler, I have an almost-working implementation up in the latest commit. ❯ .stack-work/install/x86_64-osx/lts-11.12/8.2.2/bin/dhall <<< "1 ? 2"
Natural
1 However in case an import fails to resolve, the exception is thrown before the ❯ .stack-work/install/x86_64-osx/lts-11.12/8.2.2/bin/dhall <<< "1 ? missing"
↳ missing
Error: Missing keyword found I'd like to catch it here, but I just realized that branch is never called and I should instead catch it and remove the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@f-f: The second catch
you linked to here:
https://github.com/dhall-lang/dhall-haskell/pull/473/files#diff-af3f1fd4c90ee30ad5bc7e05d5bbb9d0R707
... rethrows the exception, so it shouldn't be interfering with the outward propagation of the exception
src/Dhall/Import.hs
Outdated
loadStaticWith from_import ctx n (ImportAlt a b) = do | ||
resolveExpr a `catch` handler | ||
where | ||
resolveExpr expr = fmap join (traverse process expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simplify this to:
resolveExpr = loadStaticWith from_import ctx n
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, thanks!
However, that new branch that pattern-matches on ImportAlt
never gets called, because the last one is called on the complete expression, and that traverse
s the tree calling always the first branch on every Import
.
This happens because the process
function is something like Import -> IO (Expr Src X)
, and because the (auto-derived) implementation of traverse
for the ImportAlt
constructor is:
traverse f (ImportAlt a b) = ImportAlt <$> traverse f a <*> traverse f b
Since instead of keeping the structure here we would need to get rid of this constructor and keep the first branch that doesn't fail, we should have instead:
traverse f (ImportAlt a b) =
catch (traverse f a)
(\e -> let _ = (e :: SomeException) in traverse f b)
So basically I think that the simplest way to implement this is to replace that call to traverse
with a call to a mapM'
(maybe called importsMapM
) that will reimplement traverse in all other cases except for ImportAlt
.
(I'll probably pull in uniplate
to do this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@f-f: Have you tested the change I suggested? The reason I suggested it was because it should resolve the problem you are describing. By calling loadStaticWith
on the two sub-expressions it should correctly try the branches for Embed
/ImportAlt
before doing the fall-back case for all other constructors. I think the reason your code has the problem was because you inlined the logic for the last branch for all other constructors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, never mind. Now I understand the problem you are describing. The issue is an expression like this:
App (ImportAlt a b) c
Once it pattern matches on App
and doesn't find ImportAlt
or Embed
then it skips straight down to the Embed
constructors at the leaves of the tree
I think the correct thing to do here is to just spell out the other cases instead of relying on traverse
+ join
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of which, I should probably simplify the AST a little bit to group operators and built-ins under one constructor each so that functions like these are easier to write
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, +1 on simplifying the AST with groupings
I've just realised I've got a query about the semantics here. Which sorts of failures allow the fallback imports, and which should bubble up? Suppose I've got an import like So this suggests one more thing to note down in the spec (whether all errors should be candidates for fallback, or only, e.g., HTTP errors), and also that it would be nice to have another mode of operation that will unmask all masked errors (i.e., chase all import branches and collect errors from all of them, and also make sure that the ones that don't fail have the same type). |
@quasicomputational: Good point. I think the kind of errors that should trigger the fallback import are only if the import is missing (i.e. the file does not exist, the URL is a 404, or the environment variable is unset) |
We want to be more aggressive in falling back for HTTP errors than just 404s, since a primary use-case here is to swap out failing services for mirrors and failing services can break in a variety of ways. Probably anything except an (eventual, via redirects) 200 should fall back. |
@quasicomputational I also have a concern about import hashes: e.g. should two alternative expressions have the same hash? If not, does it mean that when an import fails, we might fetch something else with a different hash? (My take is that we should not enforce this "consistent hashing" on expressions unless the user requests it explicitly by putting an hash for the whole expression) |
@f-f: I don't think we should require the same hash or the same value for alternatives. A simple example for why you would the two alternatives to differ would be optionally retrieving an environment variable: [ env:HOME as Text ] : Optional Text ? [] : Optional Text |
@quasicomputational: Yeah, I agree that fallback on non-200 response code should be fine |
src/Dhall/Import.hs
Outdated
for @Expr@, with the exception for the @ImportAlt@ constructor, that | ||
gets eliminated while we search for imports that fail to resolve. | ||
-} | ||
importMapM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this function seems specialized to loadStaticWith
, I'd suggest inlining it in loadStaticWith
, like this:
loadStaticWith from_import ctx n expr = case expr of
ImportAlt a b -> ...
Const a -> ...
Var a -> ...
@Gabriel439 so I reimplemented So, now we can have alternative expressions: ❯ ./dhall <<< "1 ? missing"
Natural
1 And the first expression to resolve is returned: ❯ ./dhall <<< "missing ? 1 ? 2"
Natural
1 However, this behaviour was a bit of a surprise: ❯ ./dhall <<< "env:UNSET_VARIABLE as Text ? missing"
↳ missing
Error: Missing keyword found Basically if all the expressions fail to resolve, only the last error is returned. |
@f-f: The user experience should probably be to return all the errors if every branch fails, but with one caveat: any The way I would do this is to wrap any exception related to imports being missing into a new exception storing a list of exceptions. In other words: newtype MissingImports = MissingImports [SomeException] ... and then the loadStaticWith from_import ctx n (ImportAlt a b) =
loadStaticWith from_import ctx n a `catch` handler₀
where
handler₀ (MissingImports es₀) =
loadStaticWith from_import ctx n b `catch` handler₁
where
handler₁ (MissingImports es₁) =
throwIO (MissingImports (es₀ ++ es₁)) ... and then instead of: throwIO MissingKeyword ... you do: throwIO (MissingImports []) ... and then the pretty-printing logic for |
Another thing I realized is that we should specify that if a transitive import fails that should count as the parent import failing |
@Gabriel439
Examples of the current behaviour: ❯ ./dhall <<< "env:UNSET1 as Text ? env:UNSET2 ? missing ? env:UNSET3"
Error: Failed to resolve imports. Error list:
↳ env:UNSET1as Text
Error: Missing environment variable
↳ UNSET1
↳ env:UNSET2
Error: Missing environment variable
↳ UNSET2
↳ env:UNSET3
Error: Missing environment variable
↳ UNSET3
❯ ./dhall <<< "missing ? env:UNSET"
↳ env:UNSET
Error: Missing environment variable
↳ UNSET
However, I don't quite like this, because it doesn't display anymore the location of the error (I think the problem is here): ❯ ./dhall <<< "missing"
Error: No valid imports
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great to merge as is. I had just had some minor suggestions
src/Dhall/Import.hs
Outdated
show (MissingImport e) = show e | ||
|
||
-- | List of Exceptions we encounter while resolving Import Alternatives | ||
newtype MissingImports = MissingImports [MissingImport] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a simpler way to do this is to define MissingImports
as:
newtype MissingImports = MissingImports [SomeException]
... and then replace all occurrences of MissingImport
with Control.Exception.toException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, I'll try that 👍
src/Dhall/Import.hs
Outdated
|
||
-- | List of Exceptions we encounter while resolving Import Alternatives | ||
newtype MissingImports = MissingImports [MissingImport] | ||
deriving (Typeable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to derive Typeable
any longer. I recently learned that it's automatically derived for all Haskell types without having to specify this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. IIRC in our Haskell projects at work we have to derive it, but it probably depends on GHC version and/or extensions, I'll read up on this.
src/Dhall/Import.hs
Outdated
else do | ||
m <- zoom cache State.get | ||
case Map.lookup here m of | ||
Just expr -> return expr | ||
Nothing -> do | ||
let handler | ||
:: MonadCatch m | ||
let handler₀ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to change the logic for this handler? My understanding is that this mainly affects how errors are rendered. In other words, if we didn't change the handler the error might look like:
↳ ./foo
↳ Failed to resolve imports. Error list:
↳ ./bar
↳ ./baz
... whereas your change causes it to render like this:
Failed to resolve imports. Error list:
↳ ./foo
↳ ./bar
↳ ./foo
↳ ./baz
... but maybe the original way might be more clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is necessary (with this implementation atleast):
The reason why we need to match & unwrap the MissingImports
here is that if we don't we will have things like MissingImports (Imported ... (MissingImports []))
when we parse a missing
.
This means that we will render an additional "No imports found" in place of the missing
keyword, while we wanted to avoid that.
And we do need to wrap every import error in a MissingImports
, so that the ImportAlt
case works (so we have to do the unwrapping either here or there)
(I think my explanation might not be that clear, so I can re-explain more extensively if it doesn't make sense)
Not sure if there is a nicer way to do this whole exception piping 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. Yeah, in this case keep what you have
Awesome, thanks! I'd like to add some tests before we merge. |
Alright! Sounds good |
Also, in terms of how to do this more nicely ... One way to think about this is in terms of the following algebraic interpretation of import resolution:
Anyway, the idea is that we can use the usual algebraic rules to reason about how imports work:
The other benefit of thinking about things this way is that it suggests that the way you are doing things right now is probably the right way because it is equivalent to reducing the import chain to an arithmetic normal form (i.e. a sum of products). I think the main thing we can do to make things nicer is just change the exception types to enforce the normal form (i.e. use the types to prevent nesting |
Actually, the analogy is even more exact if you just replace |
@Gabriel439 great analysis, it helps 👍 (I think I was already subconsciously convinced about all of this, thanks for putting it into words) I already thought about some typed way to prevent nested The closest thing to a solution I could come up with was making a Typeclass such that:
But it wouldn't work, because I think our "exception set" is not closed, and we can have other exceptions around (e.g. I think |
@f-f: Yeah, I don't think that properly fixing Dhall's import exception hierarchy is a requirement to merge this. I think you can merge this already as is |
Great, thanks. I'll clean this up and try to get it ready to merge before the release (probably tomorrow). |
@Gabriel439 Anyways, I'll look into the problem: regression for issue151 is not passing anymore so I think I did something wrong in the "inline |
@Gabriel439 fixed the failing test by throwing an exception when encountering a typecheck error, instead of adding it to the queue (which I think is the right behaviour). So if we have an import that resolves successfully but fails to typecheck, Dhall won't try other alternatives. E.g.: ❯ ./dhall <<< "./tests/regression/issue151a.dhall ? 1"
Use "dhall --explain" for detailed errors
↳ ./tests/regression/issue151a.dhall
Error: Unbound variable
A
If that's fine (together with the new changes) we might be done here 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just two small comments
src/Dhall/Core.hs
Outdated
@@ -1207,6 +1222,8 @@ alphaNormalize (Note s e₀) = | |||
Note s e₁ | |||
where | |||
e₁ = alphaNormalize e₀ | |||
alphaNormalize (ImportAlt l₀ _r₀) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be:
alphaNormalize (ImportAlt l₀ r₀) =
ImportAlt l₁ r₁
where
l₁ = alphaNormalize l₀
r₁ = alphaNormalize r₀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You commented here:
for normalization, normalize the first branch and return that
The point is that by the time we run alphaNormalize
there should be no ImportAlt
constructors anymore.
I can add a comment on that line so it looks less surprising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment
tests/Import.hs
Outdated
catch | ||
(do | ||
_ <- Dhall.Import.load actualExpr | ||
return ()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this line should be a fail
, too, otherwise if the import succeeds it will pass this test
@f-f: Also, to clarify, I agree that the behavior should be to not fallback to other imports if type-checking an import fails. |
@f-f: I still think that |
@Gabriel439 fixed that 👍 |
@f-f: Alright, looks great! Feel free to merge |
@Gabriel439 many thanks for all the feedback & support :) |
@f-f: You're welcome! :) |
This PR implements dhall-lang/dhall-lang#172
Does not work yet, pushing this work in progress here for feedback, suggestions, etc.
So far:
missing
keyword works, but we might want to change the standard (as now it gets parsed at the level of all the other imports, and I'm not quite sure how to implement in another place)ImportAlt
constructor forExpr
is there, but does not work yet - currently stuck at parsing, not quite sure how to include theImport
parameter in there yet, here.In general I don't think that constructor is right, so I'll think about this a bit harder and I welcome suggestions.