bufbuild/protocompile

Protocompile

This repo contains a parsing/linking engine for Protocol Buffers, written in pure Go. It is suitable as an alternative to protoc (Google's official reference compiler for Protocol Buffers). This is the compiler that powers Buf and its bevy of tools.

This repo is also the spiritual successor to the github.com/jhump/protoreflect/desc/protoparse package. If you are looking for a newer version of protoparse that natively works with the newer Protobuf runtime API for Go (google.golang.org/protobuf), you have found it!

Protocol Buffers

If you've come across this repo but don't know what Protocol Buffers are, you might acquaint yourself with the official documentation. Protocol Buffers, or Protobuf for short, is an IDL for describing APIs and data structures and also a binary encoding format for efficiently transmitting and storing that data.

If you want to know more about the language itself, which is what this repo implements, take a look at Buf's Protobuf Guide, which includes a very detailed language specification.

Descriptors

Descriptors are the "lingua franca" for describing Protobuf data schemas. They are the basis of runtime features like reflection and dynamic messages. They are also the output of a Protobuf compiler: a compiler can produce them and write them to a file (whose contents are the binary-encoded form of a FileDescriptorSet) or send them to a plugin to generate code for a particular programming language.

Descriptors are similar to nodes in a syntax tree: the contents of a file descriptor correspond closely to the elements in the source file from which it was generated. Also, the descriptor model's data structures are themselves defined in Protobuf.

Using This Repo

The primary API of this repo is in this root package: github.com/bufbuild/protocompile. This is the suggested entry point. It provides a type named Compiler for compiling Protobuf source files into descriptors. There are also numerous sub-packages, most of which implement various stages of the compiler. Here's an overview (not in alphabetical order):

  • protocompile: This is the entry point, used to configure and initiate a compilation operation.
  • parser: This is the first stage of the compiler. It parses Protobuf source code and produces an AST. This package can also generate a file descriptor proto from an AST.
  • ast: This package models an Abstract Syntax Tree (AST) for the Protobuf language.
  • linker: This is the second stage of the compiler. The descriptor proto (generated from an AST) is linked, producing a more useful data structure than simple descriptor protos. This step also performs numerous validations on the source, like making sure that all type references are correct and that sources don't try to define two elements with the same name.
  • options: This is the next stage of the compiler: interpreting options. The linked data structures that come from the previous stage are used to validate and interpret all options.
  • sourceinfo: This is the last stage of the compiler: generating source code info. Source code info contains metadata that maps each element in the descriptor to the location in the original source file from which it came. This includes access to comments. To provide correct source info for options, this stage must run last, after options have been interpreted.
  • reporter: This package provides error types generated by the compiler and interfaces used by the compiler to report errors and warnings to the calling code.
  • walk: This package provides functions for walking through all of the elements in a descriptor (or descriptor proto) hierarchy.
  • protoutil: This package contains some other useful functions for interacting with Protobuf descriptors.

Migrating from protoparse

There are a few differences between this repo and its predecessor, github.com/jhump/protoreflect/desc/protoparse.

  • If you want to include "standard imports", for the well-known files that are included with protoc, you have to do so explicitly. To do this, wrap your resolver using protocompile.WithStandardImports.
  • If you used protoparse.FileContentsFromMap, in this repo you'll create a protocompile.SourceResolver and set its Accessor field to the function returned by protocompile.SourceAccessorFromMap.
  • If you used Parser.ParseToAST, you won't use the protocompile package but instead directly use parser.Parse in this repo's parser sub-package. This returns an AST for the given file contents.
  • If you used Parser.ParseFilesButDoNotLink, that is still possible in this repo, but not provided directly via a single function. Instead, you need to take a few steps:
    1. Parse the source using parser.Parse. Then use parser.ResultFromAST to construct a result that contains a file descriptor proto.
    2. Interpret whatever options can be interpreted without linking using options.InterpretUnlinkedOptions. This may leave some options in the descriptor proto uninterpreted (including all custom options).
    3. If you want source code info for the file, finally call sourceinfo.GenerateSourceInfo with the AST and the index returned from the previous step, and store the result in the file descriptor proto.