Add a concrete syntax for macaw #247

Draft
wants to merge 1 commit into
base: master
4 changes: 4 additions & 0 deletions base/macaw-base.cabal
@@ -37,6 +37,7 @@ library
galois-dwarf >= 0.2.2,
IntervalMap >= 0.5,
lens >= 4.7,
megaparsec >= 7 && < 10,
mtl,
parameterized-utils >= 2.1.0 && < 2.2,
prettyprinter >= 1.7.0,
@@ -78,6 +79,9 @@ library
Data.Macaw.Memory.LoadCommon
Data.Macaw.Memory.Permissions
Data.Macaw.Memory.Symbols
Data.Macaw.Syntax.Atom
Data.Macaw.Syntax.Parser
Data.Macaw.Syntax.SExpr
Data.Macaw.Types
Data.Macaw.Utils.Changed
Data.Macaw.Utils.IncComp
131 changes: 131 additions & 0 deletions base/src/Data/Macaw/Syntax/Atom.hs
@@ -0,0 +1,131 @@
{-# LANGUAGE OverloadedStrings #-}
-- | The atoms of the concrete syntax for macaw
module Data.Macaw.Syntax.Atom (
Keyword(..)
, keywords
, AtomName(..)
, Atom(..)
)
where

import qualified Data.Map.Strict as Map
import qualified Data.Text as T
import Numeric.Natural ( Natural )

-- | Macaw syntax keywords
--
-- These are the keywords for the *base* macaw IR (i.e., without
-- architecture-specific extensions). The architecture-specific operations are
-- parsed as 'AtomName's initially and resolved by architecture-specific parsers
-- at the atom level.
data Keyword = BVAdd
             | BVSub
             | BVMul
             | BVAdc
             | BVSbb
             | BVAnd
             | BVOr
             | BVXor
             | BVShl
             | BVShr
             | BVSar
             | PopCount
             | Bsf
             | Bsr
             | BVComplement
             | Mux
             | Lt
             | Le
             | Sle
             | Slt
             -- Syntax
             | Assign
             -- Statements
             | Comment
             | InstructionStart
             | WriteMemory
             | CondWriteMemory
             -- Expressions
             | ReadMemory
             -- Boolean operations
             | Eq_
             | Not_
             | And_
             | Or_
             | Xor_
             -- Endianness
             | BigEndian
             | LittleEndian
             -- MemRepr
             | BVMemRepr
             -- Types
             | Bool_
             | BV_
             | Float_
             | Tuple_
             | Vec_
             -- Values
             | True_
             | False_
             | BV
             | Undefined
  deriving (Eq, Ord, Show)

-- | Uninterpreted atoms
newtype AtomName = AtomText T.Text
  deriving (Eq, Ord, Show)

data Atom = Keyword !Keyword -- ^ Keywords include all of the built-in expressions and operators
          | AtomName !AtomName -- ^ Non-keyword syntax atoms (to be interpreted at parse time)
          | Register !Natural -- ^ A numbered local register (e.g., @r12@)
          | Address !Natural -- ^ An arbitrary address rendered in hex ('ArchAddrWord' or 'SegoffAddr')
          | Integer_ !Integer -- ^ Literal integers
          | Natural_ !Natural -- ^ Literal naturals
Contributor

What exactly is the difference between a Natural and an Integer? Is an Integer just a Natural with a ± sign? I ask since there are some operations that seem to require an Integer rather than a Natural argument, such as bv, whose second argument must be an Integer. Does this imply that bv 8 +0 is legal but bv 8 0 is illegal?

Contributor Author

I haven't decided yet. There are a bunch of places where a negative number is not legal but where I don't want to have partial cases due to range checks on Integer. My current thought is that I would require a sign on Integer (even for positive numbers), making all unsigned values Natural.

I think your observation about bv is a good one, and it points out an error in the code: I would prefer that it be (bv 8 0) (i.e., the actual value is also a Natural).

Note that this entire syntax is not really meant for human consumption at all. This will all be machine generated and machine parsed. It is only textual so that I don't go crazy debugging it, so I don't mind if it is ugly.

          | String_ !T.Text -- ^ Literal strings
  deriving (Eq, Ord, Show)
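The review thread on `Integer_` and `Natural_` floats one possible rule: a literal with an explicit sign is an Integer, and bare digits are always a Natural. The sketch below illustrates that rule only. It is not code from this PR: `NumAtom`, `IntegerLit`, `NaturalLit`, and `parseNumber` are hypothetical names, and it works on `String` with base functions rather than on `Text` with megaparsec so that it stays self-contained.

```haskell
import Data.Char (isDigit)
import Numeric.Natural (Natural)

-- | Hypothetical stand-in for the numeric constructors of 'Atom'.
data NumAtom
  = IntegerLit Integer  -- explicitly signed literal, e.g. "+0" or "-3"
  | NaturalLit Natural  -- unsigned literal, e.g. "8"
  deriving (Eq, Show)

-- | Classify a complete numeric token under the assumed rule: a leading
-- sign is required for an Integer, and bare digits are always a Natural.
parseNumber :: String -> Maybe NumAtom
parseNumber tok =
  case tok of
    '+' : ds | allDigits ds -> Just (IntegerLit (read ds))
    '-' : ds | allDigits ds -> Just (IntegerLit (negate (read ds)))
    ds       | allDigits ds -> Just (NaturalLit (read ds))
    _                       -> Nothing
  where
    -- Non-empty and made only of ASCII decimal digits.
    allDigits s = not (null s) && all isDigit s
```

Under this rule `parseNumber "+0"` and `parseNumber "0"` produce different constructors, so a form such as `(bv 8 0)` can insist on a Natural without a range check, which is the motivation given in the thread.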

keywords :: Map.Map T.Text Keyword
keywords = Map.fromList [ ("bv-add", BVAdd)
                        , ("bv-sub", BVSub)
                        , ("bv-mul", BVMul)
                        , ("bv-adc", BVAdc)
                        , ("bv-sbb", BVSbb)
                        , ("bv-and", BVAnd)
                        , ("bv-or", BVOr)
                        , ("bv-xor", BVXor)
                        , ("bv-shl", BVShl)
                        , ("bv-shr", BVShr)
                        , ("bv-sar", BVSar)
                        , ("bv-complement", BVComplement)
                        , ("popcount", PopCount)
                        , ("bit-scan-forward", Bsf)
                        , ("bit-scan-reverse", Bsr)
                        , ("mux", Mux)
                        , ("eq", Eq_)
                        , ("<", Lt)
                        , ("<=", Le)
                        , ("<$", Slt)
                        , ("<=$", Sle)
                        , ("not", Not_)
                        , ("and", And_)
                        , ("or", Or_)
                        , ("xor", Xor_)
                        , ("Bool", Bool_)
                        , ("BV", BV_)
                        , ("Float", Float_)
                        , ("Tuple", Tuple_)
                        , ("Vec", Vec_)
                        , ("true", True_)
                        , ("false", False_)
                        , ("bv", BV)
                        , ("undefined", Undefined)
                        , ("read-memory", ReadMemory)
                        , (":=", Assign)
                        , ("comment", Comment)
                        , ("instruction-start", InstructionStart)
                        , ("write-memory", WriteMemory)
                        , ("cond-write-memory", CondWriteMemory)
                        , ("big-endian", BigEndian)
                        , ("little-endian", LittleEndian)
                        , ("bv-mem", BVMemRepr)
                        ]
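The haddock on `Keyword` says that architecture-specific operations are first parsed as `AtomName`s and resolved later. One way a lexer could use the `keywords` table for this is to look the token up and fall back to an uninterpreted name when the lookup misses. The sketch below is an assumption about that flow, not code from this PR: it uses cut-down copies of the types, `String` instead of `Text`, and the hypothetical names `keywordTable` and `identifierAtom`.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, cut-down copies of the types in this module.
data Keyword = BVAdd | Slt | Assign
  deriving (Eq, Show)

data Atom
  = Keyword Keyword  -- a built-in operator or syntax keyword
  | AtomName String  -- anything else, left for a later resolution pass
  deriving (Eq, Show)

-- A three-entry excerpt of the real table, for illustration only.
keywordTable :: Map.Map String Keyword
keywordTable = Map.fromList [("bv-add", BVAdd), ("<$", Slt), (":=", Assign)]

-- | Resolve an identifier token: a known keyword becomes 'Keyword', and any
-- other token is kept verbatim as an 'AtomName'.
identifierAtom :: String -> Atom
identifierAtom tok = maybe (AtomName tok) Keyword (Map.lookup tok keywordTable)
```

With this shape the base parser never rejects an unknown operator name: a token such as `x86-cpuid` (a made-up example) simply survives as `AtomName "x86-cpuid"` until an architecture-specific parser interprets it.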