feat(examples): Implement Indexed AVL Tree #2703

Open
wants to merge 31 commits into
base: master

Conversation

linhpn99
Contributor

@linhpn99 linhpn99 commented Aug 15, 2024

Description:

  • This PR introduces a new structure that wraps an AVL tree implementation
  • Offers a flexible IndexedTree structure, supporting field-based indexing and efficient data retrieval

Key Features:

  • Field-Based Indexing: Allows for indexing based on specific fields of Indexable values
  • Flexible Querying: Supports querying by indexed fields to quickly retrieve values
  • Integrated AVL Tree: Provides a balanced binary search tree for efficient data operations

Advantages:

  • Efficient Lookups: AVL tree ensures O(log N) complexity for insertions, deletions, and lookups
  • Scalability: Suitable for applications requiring balanced data storage and fast access
  • Simplified Management: The IndexedTree consolidates indexing into a single structure, reducing the complexity associated with managing and synchronizing multiple AVL trees. While it doesn’t directly reduce memory usage compared to using separate trees for different fields, it streamlines the process of maintaining index consistency and simplifies the overall management of indexed data
  • Support for Multiple Object Types: The IndexedTree can index and manage various object types efficiently, offering flexibility in how data is stored and queried. This allows for versatile use cases where different types of objects need to be indexed and retrieved based on different fields
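To make the index-consistency point concrete, here is a minimal sketch of one structure keeping several field indexes in step with the main store. It is not the PR's code: `indexedStore`, `record`, and their methods are hypothetical names, and plain Go maps stand in for the AVL trees so the example stays short.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a hypothetical stand-in for an indexable value:
// it maps field names to their string form.
type record map[string]string

// indexedStore is illustrative only, not the PR's IndexedTree.
// Maps replace the AVL trees to keep the sketch small.
type indexedStore struct {
	main    map[string]record                     // primary key -> value
	indexes map[string]map[string]map[string]bool // field -> field value -> set of primary keys
}

func newIndexedStore(fields ...string) *indexedStore {
	s := &indexedStore{
		main:    map[string]record{},
		indexes: map[string]map[string]map[string]bool{},
	}
	for _, f := range fields {
		s.indexes[f] = map[string]map[string]bool{}
	}
	return s
}

// Set stores the value and updates every index. An overwrite first
// removes the old entries so no index keeps a stale key.
func (s *indexedStore) Set(key string, r record) {
	s.Remove(key)
	s.main[key] = r
	for field, idx := range s.indexes {
		v := r[field]
		if idx[v] == nil {
			idx[v] = map[string]bool{}
		}
		idx[v][key] = true
	}
}

// Remove deletes the value and its entry in every index.
func (s *indexedStore) Remove(key string) {
	old, ok := s.main[key]
	if !ok {
		return
	}
	for field, idx := range s.indexes {
		v := old[field]
		delete(idx[v], key)
		if len(idx[v]) == 0 {
			delete(idx, v)
		}
	}
	delete(s.main, key)
}

// Query returns the sorted primary keys whose field equals value.
func (s *indexedStore) Query(field, value string) []string {
	var keys []string
	for k := range s.indexes[field][value] {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := newIndexedStore("Name", "Age")
	s.Set("user1", record{"Name": "Alice", "Age": "30"})
	s.Set("user2", record{"Name": "Bob", "Age": "25"})
	s.Set("user3", record{"Name": "Charlie", "Age": "30"})
	fmt.Println(s.Query("Age", "30")) // [user1 user3]

	s.Set("user1", record{"Name": "Alice", "Age": "31"}) // overwrite
	fmt.Println(s.Query("Age", "30"))                    // [user3]
}
```

Every Set and Remove touches each index once, which is the write-time cost that comes with this kind of design.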

Limitations:

  • Indexable Interface Requirement: Users must implement the Indexable interface for their data types to be used with the IndexedTree, adding an extra step for integration
  • Performance Trade-offs: The IndexedTree requires additional memory to maintain its indexes, and Set and Remove operations may be slower because multiple indexes have to be updated
  • No Advanced Indexing: Currently supports only primitive data types for indexing
  • No Reflection: The package is designed to work within Gno, lacking support for reflection or advanced type handling
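For readers wondering what the Indexable requirement amounts to, the sketch below infers the interface from the two methods the usage demo implements (`Value` and `Type`). The exact definition in the PR may differ. `Post` and `indexKey` are hypothetical names introduced here, and the `Type|Field` key format is assumed from the `IndexKey: User|Name` lines in the demo's Render output.

```go
package main

import (
	"fmt"
	"strconv"
)

// Indexable is inferred from the usage demo; the PR's actual
// definition may differ.
type Indexable interface {
	// Value returns the string form of the named field.
	Value(fieldName string) string
	// Type returns the name of the value's type.
	Type() string
}

// Post is a hypothetical second type, showing that values of
// different types can satisfy the same interface without reflection.
type Post struct {
	Title string
	Likes int
}

func (p Post) Value(fieldName string) string {
	switch fieldName {
	case "Title":
		return p.Title
	case "Likes":
		return strconv.Itoa(p.Likes)
	default:
		return ""
	}
}

func (p Post) Type() string { return "Post" }

// indexKey builds a "Type|Field" label (assumed format).
func indexKey(v Indexable, field string) string {
	return v.Type() + "|" + field
}

func main() {
	var v Indexable = Post{Title: "hello", Likes: 3}
	fmt.Println(indexKey(v, "Likes"), v.Value("Likes")) // Post|Likes 3
}
```

Because each type hand-writes its own `Value` switch, every indexable field must be converted to a string explicitly, which is the integration step this limitation refers to.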

Usage Demo:

package avl

import (
	"fmt"
	"gno.land/p/demo/ufmt"
	"gno.land/p/demo/avl"
)

// User implements Indexable interface
type User struct {
	Name   string
	Age    int
	IsMale bool
}

// Value returns the string representation of the field value
func (m User) Value(fieldName string) string {
	switch fieldName {
	case "Name":
		return m.Name
	case "Age":
		return ufmt.Sprintf("%d", m.Age)
	case "IsMale":
		return ufmt.Sprintf("%t", m.IsMale)
	default:
		return ""
	}
}

// Type returns the type of the struct as a string
func (m User) Type() string {
	return "User"
}

func init() {
	// Create a new IndexedTree
	tree := avl.NewIndexedTree()

	// Create sample objects
	obj1 := User{Name: "Alice", Age: 30, IsMale: false}
	obj2 := User{Name: "Bob", Age: 25, IsMale: true}
	obj3 := User{Name: "Charlie", Age: 30, IsMale: true}

	// Add indexes for Name and Age
	tree.NewIndex("User", "Name")
	tree.NewIndex("User", "Age")

	// Insert objects into the tree
	tree.Set("user1", obj1)
	tree.Set("user2", obj2)
	tree.Set("user3", obj3)

	// Query the Age index for all users aged 30
	results, found := tree.QueryByField("User", "Age", "30")
	if found {
		// Print the matching values
		fmt.Println(results)
	}

	// Print the status of the tree
	fmt.Println(tree.Render())
	// Output:
	// IndexedTree Status:
	// Main Tree Size: 3
	// Index Count: 2
	// Indexes:
	//   IndexKey: User|Name, Count: 3
	//   IndexKey: User|Age, Count: 3
}
Contributors' checklist...
  • Added new tests, or not needed, or not feasible
  • Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory
  • Updated the official documentation or not needed
  • No breaking changes were made, or a BREAKING CHANGE: xxx message was included in the description
  • Added references to related issues and PRs
  • Provided any useful hints for running manual tests
  • Added new benchmarks to generated graphs, if any. More info here.

@linhpn99 linhpn99 requested a review from jaekwon as a code owner August 15, 2024 14:56
@github-actions github-actions bot added the 🧾 package/realm Tag used for new Realms or Packages. label Aug 15, 2024

codecov bot commented Aug 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 60.45%. Comparing base (9396400) to head (2fb908a).
Report is 54 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2703      +/-   ##
==========================================
+ Coverage   60.44%   60.45%   +0.01%     
==========================================
  Files         563      563              
  Lines       75157    75157              
==========================================
+ Hits        45427    45437      +10     
+ Misses      26341    26329      -12     
- Partials     3389     3391       +2     
Flag                 Coverage Δ
contribs/gnodev      60.65% <ø> (-0.82%) ⬇️
contribs/gnofaucet   14.46% <ø> (ø)
gno.land             67.21% <ø> (ø)
gnovm                64.53% <ø> (+0.06%) ⬆️
misc/genstd          80.54% <ø> (ø)
misc/logos           20.23% <ø> (ø)
tm2                  62.06% <ø> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


Contributor

@deelawn deelawn left a comment

Can you publish some benchmarks demonstrating how this would make lookups for objects with particular field values faster than using the existing AVL tree implementation? At first look, it seems like QueryByField still requires scanning the entire tree. If doing a lookup for a field using an indexed subtree, wouldn't you want to limit iteration to values prefixed by what you are looking up? Like "|"? Here's an example of how you could iterate more efficiently:

package hello

import (
	"gno.land/p/demo/avl"
)

var tree avl.Tree

func init() {
	tree.Set("hello|world", "asfasdf")
	tree.Set("hello|goodbye", "hhhhhhh")
	tree.Set("other", "bbbbb")
}

func GetMatches() []string {
	var matches []string
	lookup := "hello"
	start := lookup + "|"
	end := lookup + string('|'+1)

	tree.Iterate(start, end, func(key string, value interface{}) bool {
		matches = append(matches, key, value.(string))
		return false
	})

	return matches
}

In what cases would using something like this be most useful? I can think of one -- when the values being stored in the tree are different struct types with different fields. I'd be curious to see a realm that has such a use case.
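The range bounds in the snippet above can be checked without a tree. The following sketch uses only the Go standard library: a sorted slice plus binary search stands in for the ordered tree, under the assumption that the end bound is exclusive, as the snippet implies. `prefixBounds` and `rangeScan` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"sort"
)

// prefixBounds returns the half-open range [start, end) covering every key
// that begins with lookup+"|". '|' is byte 0x7C, so replacing it with the
// next byte value, '}' (0x7D), gives an exclusive upper bound.
func prefixBounds(lookup string) (start, end string) {
	return lookup + "|", lookup + string(rune('|'+1))
}

// rangeScan stands in for an ordered tree's range iteration: two binary
// searches locate the bounds, and only the keys between them are returned.
// The input slice must already be sorted.
func rangeScan(sorted []string, start, end string) []string {
	lo := sort.SearchStrings(sorted, start)
	hi := sort.SearchStrings(sorted, end)
	return sorted[lo:hi]
}

func main() {
	keys := []string{"hello|world", "hello|goodbye", "other", "hello", "hello}x", "help|me"}
	sort.Strings(keys)

	start, end := prefixBounds("hello")
	fmt.Println(rangeScan(keys, start, end)) // [hello|goodbye hello|world]
}
```

Keys such as "hello", "hello}x", and "help|me" fall outside the range, so the scan visits only the matching entries instead of the whole key set.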

@linhpn99
Contributor Author

linhpn99 commented Aug 17, 2024

Can you publish some benchmarks demonstrating how this would make lookups for objects with particular field values faster than using the existing AVL tree implementation? At first look, it seems like QueryByField still requires scanning the entire tree. If doing a lookup for a field using an indexed subtree, wouldn't you want to limit iteration to values prefixed by what you are looking up? Like "|"? Here's an example of how you could iterate more efficiently:

package hello

import (
	"gno.land/p/demo/avl"
)

var tree avl.Tree

func init() {
	tree.Set("hello|world", "asfasdf")
	tree.Set("hello|goodbye", "hhhhhhh")
	tree.Set("other", "bbbbb")
}

func GetMatches() []string {
	var matches []string
	lookup := "hello"
	start := lookup + "|"
	end := lookup + string('|'+1)

	tree.Iterate(start, end, func(key string, value interface{}) bool {
		matches = append(matches, key, value.(string))
		return false
	})

	return matches
}

In what cases would using something like this be most useful? I can think of one -- when the values being stored in the tree are different struct types with different fields. I'd be curious to see a realm that has such a use case.

Thank you for the suggestion. I have implemented it, and the IndexedTree now supports multiple object types. However, I am not sure where to write a benchmark for the IndexedTree, since gno test doesn't yet support the -bench flag, and I'm also not very familiar with writing tests in the gnovm/tests folder. Do you have any advice?

@linhpn99 linhpn99 requested a review from deelawn August 17, 2024 13:11
@linhpn99
Contributor Author

#2194

@Kouteki Kouteki added the review/triage-pending PRs opened by external contributors that are waiting for the 1st review label Oct 3, 2024
@jefft0 jefft0 removed the review/triage-pending PRs opened by external contributors that are waiting for the 1st review label Oct 4, 2024
@jefft0
Contributor

jefft0 commented Oct 4, 2024

Removed the "review team" label because this is already reviewed by deelawn.

Labels
🧾 package/realm Tag used for new Realms or Packages.
Projects
Status: No status
Status: In Review
4 participants