Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/parser: feature request: parse declarations only #40822

Closed
matthewmueller opened this issue Aug 16, 2020 · 9 comments
Closed

go/parser: feature request: parse declarations only #40822

matthewmueller opened this issue Aug 16, 2020 · 9 comments

Comments

@matthewmueller
Copy link

@matthewmueller matthewmueller commented Aug 16, 2020

What version of Go are you using (go version)?

$ go version
go version go1.15 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/m/Library/Caches/go-build"
GOENV="/Users/m/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/m/Go/pkg/mod"
GONOPROXY="github.com/matthewmueller"
GONOSUMDB="github.com/matthewmueller"
GOOS="darwin"
GOPATH="/Users/m/Go"
GOPRIVATE="github.com/matthewmueller"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/m/Go/src/github.com/matthewmueller/duo/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/4f/tcxcr6_55v9bp38d8g4hjlf80000gn/T/go-build905655404=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I'd like to be able to parse a file or package with only the type declarations and function signatures, without the function bodies. I find myself wanting this quite a bit for code generation, where you want the shapes, but you don't care about the implementations.

What did you expect to see?

In go/parser/interface.go, you'll find the following modes:

// A Mode value is a set of flags (or 0).
// They control the amount of source code parsed and other optional
// parser functionality.
//
type Mode uint

const (
	PackageClauseOnly Mode             = 1 << iota // stop parsing after package clause
	ImportsOnly                                    // stop parsing after import declarations
	ParseComments                                  // parse comments and add them to AST
	Trace                                          // print a trace of parsed productions
	DeclarationErrors                              // report declaration errors
	SpuriousErrors                                 // same as AllErrors, for backward-compatibility
	AllErrors         = SpuriousErrors             // report all errors (not just the first 10 on different lines)
)

I'd love an additional mode, perhaps DeclarationsOnly. This would parse a bit more than ImportsOnly, but not the whole source tree.

What did you see instead?

Unfortunately, I couldn't find a way to parse at this specific granularity.

@davecheney
Copy link
Contributor

@davecheney davecheney commented Aug 16, 2020

Which declarations would be included? only var, type, const, and funcs? at the file level?

@matthewmueller
Copy link
Author

@matthewmueller matthewmueller commented Aug 16, 2020

That's a good question. For my use-case, it'd just be function signatures and struct declarations, but I could see all general declarations in that group as well as constants.

I think where I'd draw the line is parsing everything except function bodies and variables. That would still eliminate parsing large chunks of the most files.

@josharian
Copy link
Contributor

@josharian josharian commented Aug 16, 2020

Parsing is generally really cheap. Typechecking is expensive, and go/types supports not typechecking function bodies. Also, the parser would still have to do the work to find out where the function body ends and ensure that there are no parsing errors inside the function body.

Have you determined that parsing is a bottleneck or performance issue for your application?

@matthewmueller
Copy link
Author

@matthewmueller matthewmueller commented Aug 16, 2020

Have you determined that parsing is a bottleneck or performance issue for your application?

I have not. My line of thinking was try to eliminate as much work as possible and since there are already other modes for eliminating parsing, maybe we could add one more.

I'm running this inside a file watching loop that parses, re-generates, then runs the code on each change, so I'm trying to reduce latency wherever possible.

@davecheney
Copy link
Contributor

@davecheney davecheney commented Aug 16, 2020

That's a good question. For my use-case, it'd just be function signatures and struct declarations, but I could see all general declarations in that group as well as constants.

I think where I'd draw the line is parsing everything except function bodies and variables. That would still eliminate parsing large chunks of the most files.

What about func, var, type, or const declarations inside a function?

@matthewmueller
Copy link
Author

@matthewmueller matthewmueller commented Aug 17, 2020

the parser would still have to do the work to find out where the function body ends

This is a great point @josharian. This may make my suggestion invalid since it seems like you'd still need to do some bookkeeping as you're scanning over tokens. Then at that point, why not parse it?

What about func, var, type, or const declarations inside a function?

@davecheney my suggestion would be to ignore everything inside the function body, including those declarations. Pulling out some random code and re-purposing the diff syntax a bit. The red is what I'd expect to be ignored with this mode:

package convo

import (
	"fmt"
	"regexp"
	"strings"

	"github.com/matthewmueller/jack/internal/pogo/standupuser"

	"github.com/dustin/go-humanize/english"
	"github.com/matthewmueller/convo"
	"github.com/matthewmueller/jack/internal/pogo/standup"
	"github.com/matthewmueller/jack/internal/pogo/team"
	"github.com/pkg/errors"
)

// StandupDeleteIntent intent
const StandupDeleteIntent = "standup_delete"

// StandupDeleteInput input
const StandupDeleteInput = "delete standup"

- // StandupDelete topic
- var StandupDelete = NewTopic(StandupDeleteIntent).
- 	Slot("standup_name", regexp.MustCompile(".*")).
- 	Slot("confirm_delete", regexp.MustCompile(".*")).
- 	Input(StandupDeleteInput).
- 	Topic(func(ctx *Context, intent convo.Intent) convo.Topic {
- 		return &standupDelete{ctx, intent}
- 	})

- // replies
- var (
- 	StandupDeleteIntro = Template(`
- 		Are you sure you want to delete {{.standup_name}}? You cannot undo this action.
- 	`)
- 
- 	StandupDeleteIntroWithActiveUsers = Template(`
- 		Are you sure you want to delete *{{.standup_name}}*? This will also delete everyone's past reports for this standup. You cannot undo this action.
- 
- 
- 		Right now {{.user_list}} participate in standup.
- 	`)
- 
- )

// standupDelete struct
type standupDelete struct {
	*Context
	convo.Intent
}

- var _ convo.Topic = (*standupDelete)(nil)

// Handle team onboard
func (c *standupDelete) Handle(r *convo.Request, w *convo.Response) {
- 	if nevermind(r.Text) {
- 		w.Text = Nevermind()
- 		return
- 	}
- 
- 	findingStandup := r.Slots["standup_name"] != nil
- 	confirmingDelete := r.Slots["confirm_delete"] != nil
- 
- 	switch {
- 	case findingStandup:
- 		c.FindingStandup(r, w)
- 	case confirmingDelete:
- 		c.ConfirmingDelete(r, w)
- 	default:
- 		c.Introduce(r, w)
- 	}
}

func (c *standupDelete) Introduce(r *convo.Request, w *convo.Response) {
- 	// get the bot token
- 	botToken := r.Meta["bot_access_token"]
- 	if botToken == "" {
- 		c.catch(r, w, errors.New("no bot access token found"))
- 		return
- 	}
- 
- 	// get the team
- 	tm, err := team.FindByBotAccessToken(c.db, botToken)
- 	if err != nil {
- 		c.catch(r, w, errors.Wrap(err, "error getting the team"))
- 		return
- 	}
- 
- 	// Find all the standups for the team
- 	// TODO: access control
- 	standups, err := standup.FindMany(c.db, standup.NewFilter().TeamID(tm.ID))
- 	if err != nil {
- 		c.catch(r, w, errors.Wrap(err, "error finding many standups"))
- 		return
- 	}
- 	return
}
 
// Finding the standup, because we received a name
func (c *standupDelete) FindingStandup(r *convo.Request, w *convo.Response) {
- 	standupName := strings.ToLower(*r.Slots["standup_name"])
- 
- 	// get the bot token
- 	botToken := r.Meta["bot_access_token"]
- 	if botToken == "" {
- 		c.catch(r, w, errors.New("no bot access token found"))
- 		return
- 	}
}

func (c *standupDelete) ConfirmingDelete(r *convo.Request, w *convo.Response) {
-	confirm := *r.Slots["confirm_delete"]
-
-	// same as nevermind
-	if no(confirm) {
-		w.Text = Nevermind()
-		return
-	}
-
-	// catch all
-	if !yes(confirm) {
-		w.Intent = StandupDeleteIntent
-		w.Slot = "confirm_delete"
-		w.Text = StandupDeleteConfirmCatchAll()
-		return
-	}
}
@ianlancetaylor ianlancetaylor changed the title Feature: Parse declarations only go/parser: feature request: parse declarations only Aug 18, 2020
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Aug 18, 2020

@griesemer
Copy link
Contributor

@griesemer griesemer commented Aug 19, 2020

As @josharian has already observed, it's not trivial to "skip" over the contents of function bodies; i.e., at least if we don't know if they are correct. If we know that the code is correct, one could accelerate the parsing of function bodies by only counting opening/closing braces (they have to match up); but one would still have to read and tokenize the source text. One could do a little experiment and see what the performance difference would be, but it may not matter in the overall application (presumably, the result of parsing is used somehow).

@matthewmueller Is parsing a particularly time-consuming component of your application?

@matthewmueller
Copy link
Author

@matthewmueller matthewmueller commented Aug 25, 2020

@matthewmueller Is parsing a particularly time-consuming component of your application?

I did some benchmarking of the parser in a more isolated environment and it's very fast. 2-4ms fast.. 🙌

Closing. Thanks everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
6 participants
You can’t perform that action at this time.