feat:(ast) introduce Value to support concurrent operations #573

Closed
wants to merge 67 commits into from
Commits
376c98c
feat: `RawNode` supports concurrently read
AsterDY Jan 10, 2024
73ae242
support `Set` and `Unset`
AsterDY Jan 13, 2024
7687283
validate raw json
AsterDY Jan 14, 2024
84d69d2
rename
AsterDY Jan 14, 2024
9d69d5c
`Add`
AsterDY Jan 14, 2024
4b4d6f5
add `Len()`
AsterDY Jan 14, 2024
65eda90
support `GetMany`
AsterDY Jan 14, 2024
4d81d1a
support `SetMany`
AsterDY Jan 14, 2024
3970254
change tests
AsterDY Jan 15, 2024
bcdb439
rename argues
AsterDY Jan 15, 2024
6eff64c
support `UnsetMany`
AsterDY Jan 15, 2024
2823dc6
refactor
AsterDY Jan 15, 2024
0456fa0
support `set_by_path`
AsterDY Jan 15, 2024
e02c4f4
support `UnsetByPath`
AsterDY Jan 16, 2024
0415087
rename
AsterDY Jan 16, 2024
45f273e
benchmarks
AsterDY Jan 16, 2024
28cb989
add GetMany benchs
AsterDY Jan 16, 2024
9ebe3b6
fmt
AsterDY Jan 16, 2024
4004a72
fmt
AsterDY Jan 16, 2024
b11ddc8
add setANY
AsterDY Jan 16, 2024
7b2b02f
refactor
AsterDY Jan 16, 2024
596e689
comment
AsterDY Jan 16, 2024
e742506
PopMany
AsterDY Jan 16, 2024
e5e2231
chore
AsterDY Jan 16, 2024
f369e59
adjust api
AsterDY Jan 16, 2024
b3543c5
test
AsterDY Jan 16, 2024
07fd54e
test
AsterDY Jan 16, 2024
209ab9e
fix
AsterDY Jan 16, 2024
6259b1d
NewValueJSONBytes
AsterDY Jan 17, 2024
2add2ea
add insert option
AsterDY Jan 17, 2024
6640687
add API
AsterDY Jan 17, 2024
71ade9f
not allow apppend
AsterDY Jan 17, 2024
dc7ce40
fix
AsterDY Jan 17, 2024
5cc773e
opt: avoid memcopy
AsterDY Jan 17, 2024
5b1a543
not validate get json
AsterDY Jan 17, 2024
8f45b78
validate json after get
AsterDY Jan 19, 2024
f2542d1
doc
AsterDY Jan 19, 2024
4d0df60
raw node
AsterDY Jan 19, 2024
ca7f293
1
AsterDY Jan 19, 2024
4a065cc
add benchmark
AsterDY Jan 19, 2024
7430da8
add `Delete()`
AsterDY Jan 19, 2024
83facca
add `from` argues for `Add()`
AsterDY Jan 19, 2024
7b9ccc3
more fuzz
AsterDY Jan 19, 2024
706fd6e
Merge branch 'main' into feat/rawnode
AsterDY Jan 19, 2024
6990b25
add native func `GetByPathNoValidate()`
AsterDY Jan 20, 2024
e9156df
use `fsm==nil` to decide no validate
AsterDY Jan 20, 2024
e4a2225
add `GetOptions` to control `Get` behavior
AsterDY Jan 20, 2024
5d989cf
add `SearchOptions`
AsterDY Jan 20, 2024
0fb3cc0
compatible
AsterDY Jan 20, 2024
52075b9
feat: use sonic decode
AsterDY Jan 20, 2024
f706349
fix
AsterDY Jan 20, 2024
a71587b
fix fuzz
AsterDY Jan 20, 2024
7a517d7
export `GetByPath`
AsterDY Jan 21, 2024
a8ba403
export
AsterDY Jan 21, 2024
c2660c0
use std strconv
AsterDY Jan 21, 2024
9142661
1
AsterDY Jan 21, 2024
8396b06
fix: skip number may contains space
AsterDY Jan 21, 2024
b8b3e27
export SKipFast
AsterDY Jan 21, 2024
5282d44
fmt
AsterDY Jan 22, 2024
583741e
fuzz
AsterDY Jan 22, 2024
8071847
add API `ValidSyntax`
AsterDY Jan 22, 2024
57ae213
more eligible `decode`
AsterDY Jan 22, 2024
6174ca8
export `Unquote`
AsterDY Jan 22, 2024
c3e1239
remove `allowInsert` on `Set()`
AsterDY Jan 22, 2024
39958d1
opt: add `ToNode()` \ `ToValue()` to avoid json validate
AsterDY Jan 22, 2024
dcc9ecc
fix test
AsterDY Jan 22, 2024
787b8dd
fix
AsterDY Jan 22, 2024
16 changes: 16 additions & 0 deletions README.md
@@ -289,6 +289,22 @@ println(string(buf) == string(exp)) // true
- iteration: `Values()`, `Properties()`, `ForEach()`, `SortKeys()`
- modification: `Set()`, `SetByIndex()`, `Add()`

### Ast.Value
Due to `ast.Node`'s **transversely-lazy-load** design, it **CANNOT be read concurrently**. If your workload needs concurrent reads, use `ast.Value` instead:
```go
var opts = sonic.SearchOptions{
Copy: false, // whether to return a copy of the JSON instead of a reference into the source
Validate: false, // whether to validate the syntax of the returned JSON
}
val, err := opts.GetFromString(json, paths...) // skip and search JSON
any, err := val.Interface() // converts to go primitive type
```
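To illustrate why a value that only holds raw JSON can be shared across goroutines, here is a minimal sketch built on the standard library rather than on sonic. `rawValue`, `get`, and `readConcurrently` are hypothetical names for this example only; they are not the PR's `ast.Value` API. The point is that every read re-scans immutable text and caches nothing, so readers share no mutable state.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// rawValue is a hypothetical stand-in for an immutable JSON value:
// it holds raw text only and never caches parsed children.
type rawValue struct{ raw json.RawMessage }

// get decodes one object level on every call and returns the child
// as a new rawValue, leaving the receiver untouched.
func (v rawValue) get(key string) (rawValue, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(v.raw, &obj); err != nil {
		return rawValue{}, err
	}
	child, ok := obj[key]
	if !ok {
		return rawValue{}, fmt.Errorf("key %q not found", key)
	}
	return rawValue{raw: child}, nil
}

// readConcurrently reads the same key from several goroutines.
// Each goroutine writes only to its own slot of the result slice.
func readConcurrently(v rawValue, key string, workers int) []string {
	out := make([]string, workers)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if c, err := v.get(key); err == nil {
				out[i] = string(c.raw)
			}
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	v := rawValue{raw: json.RawMessage(`{"name":"sonic","stars":1}`)}
	fmt.Println(readConcurrently(v, "name", 4))
}
```

A lazily loaded node, by contrast, fills in parsed children on first access, which is a write hidden behind a read and therefore needs external synchronization.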

#### APIs
Most of its APIs are the same as `ast.Node`'s, including both `Get` and `Set`. In addition:
- It provides `GetMany` \ `SetMany` \ `UnsetMany` \ `AddMany` \ `PopMany` to read or write multiple values at once, which **reduces the overhead of repeatedly visiting the path**.
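The saving from a batched read can be sketched with the standard library: walk to the parent once, then pick several children from it in a single pass. `getMany` below is a hypothetical helper written for this illustration; its signature is an assumption and does not reflect the PR's `GetMany`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// getMany walks the parent path once and then collects several keys
// from that parent object. Keys that are absent are simply skipped.
func getMany(src []byte, parent []string, keys ...string) (map[string]json.RawMessage, error) {
	cur := json.RawMessage(src)
	for _, p := range parent {
		var obj map[string]json.RawMessage
		if err := json.Unmarshal(cur, &obj); err != nil {
			return nil, err
		}
		next, ok := obj[p]
		if !ok {
			return nil, fmt.Errorf("path element %q not found", p)
		}
		cur = next
	}
	// The parent is decoded once, however many keys are requested.
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(cur, &obj); err != nil {
		return nil, err
	}
	out := make(map[string]json.RawMessage, len(keys))
	for _, k := range keys {
		if v, ok := obj[k]; ok {
			out[k] = v
		}
	}
	return out, nil
}

func main() {
	src := []byte(`{"user":{"id":7,"name":"a","tags":[1]}}`)
	vals, err := getMany(src, []string{"user"}, "id", "name")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(vals["id"]), string(vals["name"]))
}
```

Calling a single-key lookup once per key would repeat the walk to `user` each time; the batched form pays for it once.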

### Ast.Visitor
Sonic provides an advanced API for fully parsing JSON into non-standard types (neither `struct` nor `map[string]interface{}`) without using any intermediate representation (`ast.Node` or `interface{}`). For example, you might have the following types which are like `interface{}` but actually not `interface{}`:
```go
61 changes: 58 additions & 3 deletions api.go
@@ -17,9 +17,10 @@
package sonic

import (
`io`
"io"

`github.com/bytedance/sonic/ast`
"github.com/bytedance/sonic/ast"
"github.com/bytedance/sonic/internal/rt"
)

// Config is a combination of sonic/encoder.Options and sonic/decoder.Options
@@ -184,7 +185,7 @@ func UnmarshalString(buf string, val interface{}) error {
// Note, the api expects the json is well-formed at least,
// otherwise it may return unexpected result.
func Get(src []byte, path ...interface{}) (ast.Node, error) {
return GetFromString(string(src), path...)
return GetFromString(rt.Mem2Str(src), path...)
}

// GetFromString is same with Get except src is string,
@@ -193,6 +194,60 @@ func GetFromString(src string, path ...interface{}) (ast.Node, error) {
return ast.NewSearcher(src).GetByPath(path...)
}

// Set writes val into src at the given path and returns the new JSON.
func Set(src []byte, val interface{}, path ...interface{}) ([]byte, error) {
s, err := ast.NewSearcher(rt.Mem2Str(src)).SetValueByPath(ast.NewValue(val), path...)
if err != nil {
return nil, err
}
return rt.Str2Mem(s), nil
}

// SetFromString is the same as Set except that src is a string,
// which avoids an unnecessary memory copy.
func SetFromString(src string, val interface{}, path ...interface{}) (string, error) {
return ast.NewSearcher(src).SetValueByPath(ast.NewValue(val), path...)
}

// Delete removes the value at the given path from src and returns the new JSON.
func Delete(src []byte, path ...interface{}) (string, error) {
return ast.NewSearcher(rt.Mem2Str(src)).DeleteByPath(path...)
}

// DeleteFromString is the same as Delete except that src is a string.
func DeleteFromString(src string, path ...interface{}) (string, error) {
return ast.NewSearcher(src).DeleteByPath(path...)
}

// SearchOptions
type SearchOptions struct {
Copy bool // whether to copy the returned JSON instead of referencing src
Validate bool // whether to validate the returned JSON for safety
}

// Get searches the given path from json,
// and returns its representing ast.Value.
//
// Each path arg must be integer or string:
// - Integer is target index(>=0), means searching current value as array.
// - String is target key, means searching current value as object.
//
//
// Note, the api expects the json is well-formed at least,
// otherwise it may return unexpected result.
func (opts SearchOptions) Get(src []byte, path ...interface{}) (ast.Value, error) {
return opts.GetFromString(rt.Mem2Str(src), path...)
}

// GetFromString is the same as Get except that src is a string.
func (opts SearchOptions) GetFromString(src string, path ...interface{}) (ast.Value, error) {
s := ast.NewSearcher(src)
s.Validate(opts.Validate)
s.Copy(opts.Copy)
return s.GetValueByPath(path...)
}


// Valid reports whether data is a valid JSON encoding.
func Valid(data []byte) bool {
return ConfigDefault.Valid(data)
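In `api.go` the diff replaces `string(src)` with `rt.Mem2Str(src)` in `Get`, and `Set` converts back with `rt.Str2Mem`. The PR does not show those helpers' bodies, so the following is a sketch under the assumption that they reinterpret the slice or string header without copying. `mem2Str` is a hypothetical stand-in written with the standard `unsafe` package; it shows the trade-off such a conversion makes: no allocation, but the resulting string aliases the caller's buffer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// mem2Str is a hypothetical stand-in for rt.Mem2Str. It reinterprets
// the slice header as a string header, so no bytes are copied and the
// string shares memory with b.
func mem2Str(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := []byte(`{"a":1}`)
	shared := mem2Str(buf) // aliases buf
	copied := string(buf)  // independent copy

	buf[2] = 'b' // the caller mutates its buffer afterwards

	fmt.Println(shared) // follows the mutation, since it shares buf's memory
	fmt.Println(copied) // keeps the original text
}
```

If the helpers do work this way, a value returned by reference stays valid only while the caller leaves `src` unmodified, which is the situation the `Copy` option in `SearchOptions` appears intended to address.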
84 changes: 62 additions & 22 deletions ast/api_amd64.go
@@ -33,7 +33,7 @@ import (
var typeByte = rt.UnpackEface(byte(0)).Type

//go:nocheckptr
func quote(buf *[]byte, val string) {
func Quote(buf *[]byte, val string) {
*buf = append(*buf, '"')
if len(val) == 0 {
*buf = append(*buf, '"')
@@ -73,7 +73,7 @@ func quote(buf *[]byte, val string) {
*buf = append(*buf, '"')
}

func unquote(src string) (string, types.ParsingError) {
func Unquote(src string) (string, types.ParsingError) {
return uq.String(src)
}

@@ -121,37 +121,77 @@ func (self *Parser) skipFast() (int, types.ParsingError) {
return start, 0
}

func (self *Parser) getByPath(path ...interface{}) (int, types.ParsingError) {
func (self *Parser) getByPath(path ...interface{}) (int, types.ValueType, types.ParsingError) {
fsm := types.NewStateMachine()
start := native.GetByPath(&self.s, &self.p, &path, fsm)
types.FreeStateMachine(fsm)
runtime.KeepAlive(path)
if start < 0 {
return self.p, types.ParsingError(-start)
return self.p, 0, types.ParsingError(-start)
}
return start, 0
t := switchRawType(self.s[start])
if t == _V_NUMBER {
self.p = 1 + backward(self.s, self.p-1)
}
return start, t, 0
}

func (self *Searcher) GetByPath(path ...interface{}) (Node, error) {
var err types.ParsingError
var start int
func (self *Parser) getByPathNoValidate(path ...interface{}) (int, types.ValueType, types.ParsingError) {
start := native.GetByPath(&self.s, &self.p, &path, nil)
runtime.KeepAlive(path)
if start < 0 {
return self.p, 0, types.ParsingError(-start)
}
t := switchRawType(self.s[start])
if t == _V_NUMBER {
self.p = 1 + backward(self.s, self.p-1)
}
return start, t, 0
}

self.parser.p = 0
start, err = self.parser.getByPath(path...)
if err != 0 {
// for compatibility with old version
if err == types.ERR_NOT_FOUND {
return Node{}, ErrNotExist
func DecodeString(src string, pos int, needEsc bool) (v string, ret int, hasEsc bool) {
p := NewParserObj(src)
p.p = pos
switch val := p.decodeValue(); val.Vt {
case types.V_STRING:
str := p.s[val.Iv : p.p-1]
/* fast path: no escape sequence */
if val.Ep == -1 {
return str, p.p, false
} else if !needEsc {
return str, p.p, true
}
if err == types.ERR_UNSUPPORT_TYPE {
panic("path must be either int(>=0) or string")
}
return Node{}, self.parser.syntaxError(err)
/* unquote the string */
out, err := Unquote(str)
/* check for errors */
if err != 0 {
return "", -int(err), true
} else {
return out, p.p, true
}
default:
return "", -int(_ERR_UNSUPPORT_TYPE), false
}
}

// ValidSyntax checks whether json has valid JSON syntax;
// it does not validate the UTF-8 encoding.
func ValidSyntax(json string) bool {
fsm := types.NewStateMachine()
p := 0
ret := native.ValidateOne(&json, &p, fsm, 0)
types.FreeStateMachine(fsm)

if ret < 0 {
return false
}

t := switchRawType(self.parser.s[start])
if t == _V_NONE {
return Node{}, self.parser.ExportError(err)
/* check for trailing spaces */
for ;p < len(json); p++ {
if !isSpace(json[p]) {
return false
}
}
return newRawNode(self.parser.s[start:self.parser.p], t), nil

return true
}
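`ValidSyntax` above works in two steps: validate exactly one JSON value, then accept only whitespace after it. A minimal standard-library sketch of the same two steps follows. `validSyntax` is a hypothetical stand-in that uses `encoding/json` in place of `native.ValidateOne`, and its `isSpace` helper is an assumption about which bytes count as whitespace, so its strictness may differ from sonic's.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// isSpace reports whether c is one of the four JSON whitespace bytes.
func isSpace(c byte) bool {
	return c == ' ' || c == '\t' || c == '\n' || c == '\r'
}

// validSyntax decodes a single top-level value, then checks that
// everything after it is whitespace.
func validSyntax(src string) bool {
	dec := json.NewDecoder(strings.NewReader(src))
	var first json.RawMessage
	if err := dec.Decode(&first); err != nil {
		return false // empty input, truncated value, or a syntax error
	}
	// InputOffset is the byte position just past the decoded value.
	for p := int(dec.InputOffset()); p < len(src); p++ {
		if !isSpace(src[p]) {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{`{"a":1}  `, `{"a":1} x`, `[1,2`} {
		fmt.Printf("%q -> %v\n", s, validSyntax(s))
	}
}
```

In plain Go code `json.Valid` performs both steps in one call; the explicit form is shown only to mirror the structure of the function in the diff.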
85 changes: 59 additions & 26 deletions ast/api_compat.go
@@ -1,3 +1,4 @@
//go:build !amd64 || !go1.16 || go1.22
// +build !amd64 !go1.16 go1.22

/*
@@ -19,23 +20,24 @@
package ast

import (
`encoding/base64`
`encoding/json`
`fmt`
"encoding/base64"
"encoding/json"
"runtime"
"unsafe"

`github.com/bytedance/sonic/internal/native/types`
`github.com/bytedance/sonic/internal/rt`
"github.com/bytedance/sonic/internal/native/types"
"github.com/bytedance/sonic/internal/rt"
)

func init() {
println("WARNING: sonic only supports Go1.16~1.21 && CPU amd64, but your environment is not suitable")
}

func quote(buf *[]byte, val string) {
func Quote(buf *[]byte, val string) {
quoteString(buf, val)
}

func unquote(src string) (string, types.ParsingError) {
func Unquote(src string) (string, types.ParsingError) {
sp := rt.IndexChar(src, -1)
out, ok := unquoteBytes(rt.BytesFrom(sp, len(src)+2, len(src)+2))
if !ok {
@@ -88,37 +90,68 @@ func (self *Node) encodeInterface(buf *[]byte) error {
return nil
}

func (self *Searcher) GetByPath(path ...interface{}) (Node, error) {
self.parser.p = 0

var err types.ParsingError
func (self *Parser) getByPath(path ...interface{}) (int, types.ValueType, types.ParsingError) {
for _, p := range path {
if idx, ok := p.(int); ok && idx >= 0 {
if err = self.parser.searchIndex(idx); err != 0 {
return Node{}, self.parser.ExportError(err)
if _, err := self.searchIndex(idx); err != 0 {
return self.p, 0, err
}
} else if key, ok := p.(string); ok {
if err = self.parser.searchKey(key); err != 0 {
return Node{}, self.parser.ExportError(err)
if _, err := self.searchKey(key); err != 0 {
return self.p, 0, err
}
} else {
panic("path must be either int(>=0) or string")
}
}
start, e := self.skip()
if e != 0 {
return self.p, 0, e
}
t := switchRawType(self.s[start])
if t == _V_NUMBER {
self.p = 1 + backward(self.s, self.p-1)
}
return start, t, 0
}

var start = self.parser.p
if start, err = self.parser.skip(); err != 0 {
return Node{}, self.parser.ExportError(err)
func (self *Parser) getByPathNoValidate(path ...interface{}) (int, types.ValueType, types.ParsingError) {
return self.getByPath(path...)
}

//go:nocheckptr
func DecodeString(src string, pos int, needEsc bool) (v string, ret int, hasEsc bool) {
ret, ep := skipString(src, pos)
if ep == -1 {
(*rt.GoString)(unsafe.Pointer(&v)).Ptr = rt.IndexChar(src, pos+1)
(*rt.GoString)(unsafe.Pointer(&v)).Len = ret - pos - 2
return v, ret, false
} else if !needEsc {
return src[pos+1:ret-1], ret, true
}
ns := len(self.parser.s)
if self.parser.p > ns || start >= ns || start>=self.parser.p {
return Node{}, fmt.Errorf("skip %d char out of json boundary", start)

vv, ok := unquoteBytes(rt.Str2Mem(src[pos:ret]))
if !ok {
return "", -int(types.ERR_INVALID_CHAR), true
}

t := switchRawType(self.parser.s[start])
if t == _V_NONE {
return Node{}, self.parser.ExportError(err)
runtime.KeepAlive(src)
return rt.Mem2Str(vv), ret, true
}

// ValidSyntax checks whether json has valid JSON syntax;
// it does not validate the UTF-8 encoding.
func ValidSyntax(json string) bool {
p, _ := skipValue(json, 0)
if p < 0 {
return false
}
/* check for trailing spaces */
for ;p < len(json); p++ {
if !isSpace(json[p]) {
return false
}
}

return newRawNode(self.parser.s[start:self.parser.p], t), nil
}
return true
}
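Both versions of `DecodeString` in this PR follow the same shape: a fast path that returns a substring of `src` when the string has no escape sequences, an early return of the raw text when the caller passes `needEsc == false`, and an unquoting slow path otherwise. The sketch below reproduces that control flow with the standard library only. `decodeString` is a hypothetical stand-in: it uses `json.Unmarshal` where the PR uses `Unquote` / `unquoteBytes`, and it returns `-1` on any error instead of a negated `types.ParsingError`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeString reads the JSON string that starts at src[pos], which must
// be a double quote. It returns the value, the position just past the
// closing quote (negative on error), and whether escapes were present.
func decodeString(src string, pos int, needEsc bool) (v string, ret int, hasEsc bool) {
	if pos < 0 || pos >= len(src) || src[pos] != '"' {
		return "", -1, false
	}
	end := -1
	for i := pos + 1; i < len(src); i++ {
		if src[i] == '\\' {
			hasEsc = true
			i++ // skip the escaped character
			continue
		}
		if src[i] == '"' {
			end = i
			break
		}
	}
	if end < 0 {
		return "", -1, hasEsc // unterminated string
	}
	// Fast path: no escapes, or the caller wants the raw text.
	if !hasEsc || !needEsc {
		return src[pos+1 : end], end + 1, hasEsc
	}
	// Slow path: unquote the escape sequences.
	var out string
	if err := json.Unmarshal([]byte(src[pos:end+1]), &out); err != nil {
		return "", -1, true
	}
	return out, end + 1, true
}

func main() {
	v, next, esc := decodeString(`"a\nb" rest`, 0, true)
	fmt.Printf("%q %d %v\n", v, next, esc)
}
```

The fast path matters because most JSON keys and many values contain no escapes, so returning a slice of the input avoids an allocation per string.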