Approximate match (edit distance and hamming distance) #412

Open
Haigegege opened this issue Sep 6, 2023 · 0 comments


Haigegege commented Sep 6, 2023

Hi,
I am studying edit distance and Hamming distance matching in my Hyperscan environment, but it is a little hard for me to understand how to find which match is the best one.
My example looks like this:

type ScanContext struct {
	*bytes.Buffer
	data []byte
}

func main() {
	// distance = 3
	// one of the patterns looks like "tipsytemasagronomicos.com"
	patterns = append(patterns, hs.NewPattern(pattern, hs.Caseless, hs.EditDistance(uint32(distance))))
	...

	// create the Hyperscan database and scratch space
	hsdb, err := hs.NewBlockDatabase(patterns...)
	scratch, err := hs.NewScratch(*hsdb)
	...

	buf := new(bytes.Buffer)
	input := "13243412tipsytemasagr0n0mic0s.c0m"
	if err := (*d.HsDatabase).Scan([]byte(input), scratch, handleMatch, ScanContext{buf, []byte(input)}); err != nil {
		fmt.Println(err)
	}
	fmt.Printf("match: %s\n", buf.String())
	if buf.Len() > 0 {
		log.Println("Levenshtein distance match:", true)
	}
}

// handleMatch is called for every reported match and appends data[from:to] to the buffer.
func handleMatch(id uint, from, to uint64, flags uint, data interface{}) error {
	ctx, ok := data.(ScanContext)
	if !ok {
		return fmt.Errorf("unexpected context type %T", data)
	}
	fmt.Println("from:", from)
	fmt.Println("to:", to)
	fmt.Println("data:", string(ctx.data))
	if from < to {
		_, err := fmt.Fprintf(ctx.Buffer, "%s", ctx.data[from:to])
		if err != nil {
			return err
		}
	}

	return nil
}

The result looks like this:
[two screenshots of the program output, not preserved]

So, what should I do as the next step? Could you show me some examples of approximate matching?
