Skip to content
This repository has been archived by the owner on Feb 6, 2023. It is now read-only.

More robust identifier parsing? #9

Open
glickbot opened this issue May 30, 2019 · 3 comments
Open

More robust identifier parsing? #9

glickbot opened this issue May 30, 2019 · 3 comments

Comments

@glickbot
Copy link
Contributor

This would fix #8 #6 in a more robust way

  • prefix any starting number with a 'n'
  • replace any non-unicode letter with either '_' or an empty string.

If they're done in that order and if non-unicode letters are replaced with an empty string, it would handle fields that start with non-unicode letters and numbers. Like this:

https://play.golang.org/p/yyG0XUihd59

package main

import (
	"fmt"
	"regexp"
)

const crazyFieldName = "$12t-his@is_Crazy#(*Æ"

func main() {
	decimalStart := regexp.MustCompile(`^\W*\d+`)
	replaceCrazy := regexp.MustCompile(`\W`)
	fixednum := decimalStart.ReplaceAllStringFunc(crazyFieldName, numFix)
	fixed := replaceCrazy.ReplaceAllStringFunc(fixednum, charFix)
	fmt.Println(fixed)
}

func numFix(match string) string {
	fmt.Printf("Number Fix: %s\n", match)
	return fmt.Sprintf("n%s", match)
}

func charFix(match string) string {
	fmt.Printf("Found %s\n", match)
	return ""
}

Number Fix: $12
Found $
Found -
Found @
Found #
Found (
Found *
Found Æ
n12thisis_Crazy

IMO it'd be great if we could let the user decide what they want replaced by using something like https://github.com/spf13/viper. It would allow them to use flags for most options, but could allow further customization if the user needs something more elaborate (i.e. replacing entire field names in case they don't like myType, etc). This would minimize hand-tweaking of the resulting code.

Thoughts? I can put a PR in unless you'd rather not have a dependency like spf13/viper.

@noahdietz
Copy link
Contributor

prefix any starting number with a 'n'

This seems fine to me

The question I have is: are all of these symbols and non-unicode characters allowed in OAS (where they would be used as names)? Is having a parameter with name: '@#Æ' valid in OAS? I cant see a regex that says exactly what symbols are allowed, but I have a hard time imagining this is an accepted convention.

could allow further customization if the user needs something more elaborate

Symbols & non-unicode characters aside, I believe staying as close as possible to the original naming in the document is important. I'd argue that if the user wants to change the name of the variable in the resulting generated code, they should probably change the name of the original source (parameter, enum, whatever it is). That said, the generator should be doing this, staying as close as possible to the original naming. Is that fair?

Generally, I would like to see use-cases for the above. Personally, I'm ok with saying that if one has Æ in their var name somewhere, that they should probably not :)

@glickbot
Copy link
Contributor Author

I believe staying as close as possible to the original naming in the document is important. I'd argue that if the user wants to change the name of the variable in the resulting generated code, they should probably change the name of the original source

Totally understand, my perspective has been from the stand point of a user that doesn't have control of the underlying API, so I was thinking this should handle potentially misbehaving API docs, etc. However, thinking about gnostic+plugin as being more of a true-to-OAS conversion (which would be way more maintainable), any customization a user would want could be implemented with a little JSON preprocessing (via JSON patch, etc), especially if they want to change the resulting method names, etc.

Is having a parameter with name: '@#Æ' valid in OAS?

Resetting my perspective a little, I'm curious though: with gnostic + plugins, we have 2 passes, the process to create the surface model, then the plugin process to turn the model into Go: wouldn't it be better to have the first pass creating the model perform any OAS compliance checks/processing, and limit the go plugin to make sure it's output is syntactically viable go? From what I can find, OAS limits some keys/names to ASCII alphanumeric ( and _,-, . ), but go allows any unicode letter to potentially be in an identifier. My suggestion would be to use a broad go specific regex that's only concerned about making syntactically viable go, and add OAS validation code to the surface model. Then, when/if OAS broadens it's conventions (i.e. add an umlaut), the plugin wouldn't have to change, just the surface model.

Ultimately though, our discovery docs should be OAS compliant (maybe I/we can influence that), and preprocessing can fix any of this... so while it might seem like I ranted, I don't actually feel strongly either way, I just enjoy the discussion and figuring out the scope of this project, and I'm 100% on board with whatever's decided. :)

@noahdietz
Copy link
Contributor

noahdietz commented May 31, 2019

my perspective has been from the stand point of a user that doesn't have control of the underlying API

This is valuable! Insight we don't necessarily (definitely don't) have.

gnostic+plugin as being more of a true-to-OAS conversion (which would be way more maintainable)

Yes, this world is much more well-defined and less chaotic (albeit OAS is a large, sometimes chaotic format)

wouldn't it be better to have the first pass creating the model perform any OAS compliance checks/processing

Yes! This would be the best place for any linting we want to do. This might be a more involved discussion in a separate issue on gnostic.

limit the go plugin to make sure it's output is syntactically viable go

Yes definitely. The Go plugin shouldn't be concerned with what is "valid OAS" but expect that what it is given is valid. Then, it can massage things into the Go-specific conventions as needed. This should be done regardless of the previous point about validating OAS in gnostic imo.

My suggestion would be to use a broad go specific regex that's only concerned about making syntactically viable go

sgtm

add OAS validation code to the surface model.

Yep, lets discuss this in an issue on gnostic, as mentioned earlier

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants