Package `bsc`'s pretty-printer and parser into standalone libraries #546

RyanGlScott · 2023-02-15T18:45:23Z

This is a proposal to change the way that bsc packages up its Haskell code slightly to allow easier reuse by external tools that need to consume or produce Bluespec code. I've already discussed parts of this proposal with various folks at Bluespec Inc., so this issue serves to lay out everything in one place.

Motivation

Bluespec is a convenient target language for high-level languages that want to compile down to hardware. In particular, I would like to be able to generate Bluespec code from a Haskell library, and some day I would also like to be able to ingest the generated code for analysis purposes. To put it another way, I need Haskell code for pretty-printing and parsing Bluespec.

To my knowledge, the only fully fledged Haskell code for doing this lives in bsc itself. It is certainly possible to extract the relevant subset of bsc's code needed for my purposes and turn it into a standalone library, and that is what I have done in the language-bluespec library. However, this is not a perfect solution, as I have now effectively created a fork of bsc. If there are any subsequent changes to the Bluespec language, this means that I would have to keep this fork in sync with new changes to bsc, which doesn't sound very scalable in the long term.

Proposal

I propose to:

Create a .cabal file that includes all of the AST-related data types for representing BH and BSV code, along with their Pretty instances. One possible name for this library could be bsc-syntax.
Create another .cabal file (which depends on the previous one) for parsing BH and BSV code. One possible name for this library could be bsc-parse.
If there is any code that these libraries rely on that does not obviously belong in either library, put that in a separate .cabal file of its own. Possible names for this library could be bsc-util or bsc-core.

These .cabal files likely would not be used by bsc's own build system, as it can continue to build them with Make as before. The benefit of putting things into .cabal files would come after uploading these .cabal-ized libraries to Hackage, where external users could then depend on them.

Future work

Things that are not in the immediate scope of this proposal (but perhaps would be worth considering later):

Redesigning the module hierarchy

Currently, bsc uses a mostly flat module naming hierarchy (see here). This is not the convention that most libraries on Hackage use nowadays, however. When I factored out code into language-bluespec, I renamed modules to things that include a Language.Bluespec.* prefix.

That being said, this is not a strict requirement. We can continue to use the module names as they currently exist in bsc for now. They might look a little odd when juxtaposed with other module names, but they will still work.

Making a "surface syntax" AST

Currently, the various AST data types have constructors that are only reachable in certain compiler passes. For instance, see this comment on IdK:

bsc/src/comp/CSyntax.hs

Lines 170 to 174 in 034050d

    
           data IdK 
        
                   = IdK Id 
        
                   | IdKind Id Kind 
        
                   -- this should not exist after typecheck 
        
                   | IdPKind Id PartialKind

IdPKind doesn't exist in the surface syntax of BH, and as such, it serves no utility for my personal needs. @quark17 suggested that it might be worthwhile to have distinct AST types that represent the raw, parsed syntax, which would then be turned into the current AST data types during compilation. That way, the "surface syntax" AST would only include things that are directly relevant to parsing and pretty-printing, which could potentially cut down on the surface area of the library.

That being said, doing this work would be somewhat involved and likely above my pay grade. I'm going to exclude it from this proposal, but it might be worth doing at some point.

Prior art

@quark17 notes here that there is a long-term goal of making the code in bsc more modular. I view this proposal as an important step towards that goal, as the pretty-printing and parsing portions of bsc are relatively self-contained pieces of functionality that could be re-used in other projects.

#166 puts every Haskell module in bsc into a single .cabal file in pursuit of being able to more easily integrate with package management tools. This proposal is largely working towards the same end goal, but instead of putting everything into one .cabal file, this proposal splits up the functionality into multiple smaller libraries. I don't believe this proposal is incompatible with #166, however—it would just mean that a "top-level" .cabal file that includes everything in bsc would need to depend on the smaller libraries.

Unresolved questions

Maintenance

Making .cabal files for everything is only one step of the process. The next step would be to make Hackage releases of the libraries to Hackage, most likely in conjunction with bsc releases. I am not clear on who would own this part of the proposal, but perhaps it makes the most sense for the bsc release manager to do this part.

Dealing with global state

Some parts of the bsc codebase depend on global state, which presents a challenge for refactoring. For example, consider this part of the code that relates to how identifiers are pretty-printed:

bsc/src/comp/IdPrint.hs

Lines 70 to 82 in bd141b5

    
           -- hack: suppress the package name for operators 
        
           getBSVIdString :: Id -> String 
        
           getBSVIdString a = (getBSVIdStringz a) 
        
           getBSVIdStringz :: Id -> String 
        
           getBSVIdStringz a 
        
               | getIdBase a == fsEmpty = internalError "CVPrint.getIdStr: empty identifier" 
        
               | getIdQual a == fsEmpty = getIdBaseStringz a 
        
               | not (isIdChar (head (getIdBaseStringz a))) = getIdBaseStringz a -- operators 
        
               | (not show_qual) && (getIdQual a == fsPrelude) = 
        
                     getIdBaseStringz a  -- suppress "Prelude::" unless flag is on 
        
               | (not show_qual) && (getIdQual a == fsPreludeBSV) = 
        
                     getIdBaseStringz a  -- suppress "Prelude::" unless flag is on 
        
               | otherwise = getIdQualString a ++ "::" ++ getIdBaseStringz a

The key part is show_qual, which is defined as:

bsc/src/comp/IdPrint.hs

Lines 18 to 19 in bd141b5

    
           show_qual :: Bool 
        
           show_qual = "-show-qualifiers" `elem` progArgs

And where progArgs is defined as:

bsc/src/comp/Libs/IOUtil.hs

Lines 8 to 19 in bd141b5

    
           {-# NOINLINE progArgs #-} 
        
           progArgs :: [String] 
        
           progArgs = unsafePerformIO $ do 
        
             args <- getArgs 
        
             bscopts <- getEnvDef "BSC_OPTIONS" "" 
        
             return $ (words bscopts) ++ args 
        
           getEnvDef :: String -> String -> IO String 
        
           getEnvDef e d = CE.catch (getEnv e) handler 
        
             where 
        
               handler :: CE.SomeException -> IO String 
        
               handler _ = return d

Eek! progArgs is reading the value of an environment variable and then using unsafePerformIO to return it in a pure context. This seems like something that we would not want to include in a general-purpose library—or, at the very least, we would want to make this option more explicit. I am unclear on what the best way to handle this should be.

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Package `bsc`'s pretty-printer and parser into standalone libraries #546

Package `bsc`'s pretty-printer and parser into standalone libraries #546

RyanGlScott commented Feb 15, 2023

Package bsc's pretty-printer and parser into standalone libraries #546

Package bsc's pretty-printer and parser into standalone libraries #546

Comments

RyanGlScott commented Feb 15, 2023

Motivation

Proposal

Future work

Redesigning the module hierarchy

Making a "surface syntax" AST

Prior art

Unresolved questions

Maintenance

Dealing with global state

Package `bsc`'s pretty-printer and parser into standalone libraries #546

Package `bsc`'s pretty-printer and parser into standalone libraries #546