<a href="https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/aql_2022.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AQL 2022 Review
Happy New Year! 
In this notebook we will review some of the great AQL additions that came out in 2022. You can use this notebook to quickly get hands-on with these new features, but be sure to check out the [release highlights](https://www.arangodb.com/docs/stable/release-notes.html#whats-new) for more info.

This notebook accompanies the [AQL in 2022 webinar](https://hopin.com/events/aql-in-2023).

# Setup

Before getting started with ArangoDB, we need to prepare our environment and create a database on ArangoDB's managed service, Oasis.

In [None]:
%%capture
!pip install adb-cloud-connector
!pip3 install "python-arango>=5.0"

In [None]:
from arango import ArangoClient

from adb_cloud_connector import get_temp_credentials

con = get_temp_credentials(tutorialName="AQL2022")

print(con)

You can access the WebUI via https://tutorials.arangodb.cloud:8529/ and use the credentials shown above.

In [None]:
client = ArangoClient(hosts=con['url'])
database = client.db(con['dbName'], username=con['username'], password=con['password'])
aql = database.aql

In [None]:
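import requests

def create_example_graph(graph_name):
  """Recreate one of the server's example graphs in the tutorial database."""
  # Drop any existing copy (and its collections) so the example data starts fresh.
  if database.has_graph(graph_name):
    database.delete_graph(graph_name, drop_collections=True)

  # Ask the server to create the named example graph and report the HTTP status.
  response = requests.post(f"https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/graph-examples/create/{graph_name}", auth=(con['username'], con['password']))
  return response.status_code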


---



# New with 3.9 


## Decay Functions
Decay functions calculate a score with a function that decays depending on the distance of a numeric value from a user-given origin.

### DECAY_EXP()

Calculate the score for one or multiple values with an exponential function that decays depending on the distance of a numeric value from a user-given origin.

### DECAY_LINEAR()

Calculate the score for one or multiple values with a linear function that decays depending on the distance of a numeric value from a user-given origin.

### DECAY_GAUSS()

Calculate the score for one or multiple values with a Gaussian function that decays depending on the distance of a numeric value from a user-given origin.

In [None]:
results = aql.execute(
    """
    LET exponential = DECAY_EXP(2, 0, 10, 0, 0.2)
    LET linear = DECAY_LINEAR(2, 0, 10, 0, 0.2) 
    LET gaussian = DECAY_GAUSS(2, 0, 10, 0, 0.2)
    RETURN {
      exponential,
      linear,
      gaussian
    }
"""
)
[res for res in results]
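
To make the shapes of the three curves concrete, here is a minimal pure-Python sketch. It assumes the argument order `(value, origin, scale, offset, decay)` used in the query above and the commonly used definitions in which the score equals `decay` at a distance of `scale` beyond `offset`. The helper names are hypothetical, and this is an illustration of the formulas rather than ArangoDB's implementation.

```python
import math

def _distance(value, origin, offset):
    # Distance from the origin; values within `offset` count as distance 0.
    return max(0.0, abs(value - origin) - offset)

def decay_exp(value, origin, scale, offset, decay):
    # Exponential curve: exp(lambda * d) with lambda = ln(decay) / scale.
    lam = math.log(decay) / scale
    return math.exp(lam * _distance(value, origin, offset))

def decay_linear(value, origin, scale, offset, decay):
    # Straight line that reaches `decay` at d == scale and is clamped at 0.
    s = scale / (1.0 - decay)
    return max(0.0, (s - _distance(value, origin, offset)) / s)

def decay_gauss(value, origin, scale, offset, decay):
    # Bell curve: exp(-d^2 / (2 * sigma^2)) with sigma^2 = -scale^2 / (2 * ln(decay)).
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    return math.exp(-_distance(value, origin, offset) ** 2 / (2.0 * sigma_sq))

# Same arguments as the AQL query above.
scores = {
    "exponential": decay_exp(2, 0, 10, 0, 0.2),
    "linear": decay_linear(2, 0, 10, 0, 0.2),
    "gaussian": decay_gauss(2, 0, 10, 0, 0.2),
}
print(scores)
```

Under these assumptions, all three functions return 1 at the origin and `decay` (0.2 here) at a distance of `scale`; they differ in how quickly the score falls off in between.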

## Vector Functions
ArangoDB 3.9 adds three new vector functions.

### COSINE_SIMILARITY()

Return the cosine similarity between x and y.

### L1_DISTANCE()

Return the Manhattan distance between x and y.

### L2_DISTANCE()

Return the Euclidean distance between x and y.

In [None]:
results = aql.execute(
    """
    LET cosine_similarity = COSINE_SIMILARITY([[0,1,0,1],[1,0,0,1],[1,1,1,0],[0,0,0,1]], [1,1,1,1])
    LET L1 = L1_DISTANCE([[0,1,0,1],[1,0,0,1],[1,1,1,0],[0,0,0,1]], [1,1,1,1])
    LET L2 = L2_DISTANCE([[0,1,0,1],[1,0,0,1],[1,1,1,0],[0,0,0,1]], [1,1,1,1])
    RETURN {
      cosine_similarity,
      L1,
      L2
    }
"""
)
[res for res in results]
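
For reference, the three measures can be written out in a few lines of plain Python. This is a sketch of the standard textbook formulas with hypothetical helper names, not ArangoDB's code; it applies each formula row by row to the same 2D array and vector used in the query above.

```python
import math

def cosine_similarity(x, y):
    # Dot product divided by the product of the vector lengths.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def l1_distance(x, y):
    # Manhattan distance: sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(x, y))

def l2_distance(x, y):
    # Euclidean distance: square root of the summed squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

rows = [[0, 1, 0, 1], [1, 0, 0, 1], [1, 1, 1, 0], [0, 0, 0, 1]]
y = [1, 1, 1, 1]
for row in rows:
    print(row, cosine_similarity(row, y), l1_distance(row, y), l2_distance(row, y))
```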

Also new with 3.9:
* Multi-dimensional indexes (experimental)
* Edge cache refilling (experimental)

# New with 3.10

## ALL_SHORTEST_PATHS

Find all paths of shortest length between a start and target vertex.

In [None]:
create_example_graph("kShortestPathsGraph")

In [None]:
# Without a LIMIT, K_SHORTEST_PATHS returns ALL paths between the two vertices, not just the shortest ones
results = aql.execute(
    """
    FOR p IN OUTBOUND K_SHORTEST_PATHS 'places/Carlisle' TO 'places/London'
      GRAPH 'kShortestPathsGraph'
    RETURN { places: p.vertices[*].label }
    """
)
[res for res in results]

In [None]:
# Using ALL_SHORTEST_PATHS returns only the list of shortest paths
results = aql.execute(
    """
    FOR p IN OUTBOUND ALL_SHORTEST_PATHS 'places/Carlisle' TO 'places/London'
      GRAPH 'kShortestPathsGraph'
    RETURN { places: p.vertices[*].label }
    """
)
[res for res in results]
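
The idea behind "all paths of shortest length" can be sketched with a breadth-first search that remembers every predecessor reaching a vertex at its minimum depth. The code below is a minimal illustration on a hypothetical toy graph (the vertex names are made up and are not the example dataset); it is not how ArangoDB implements the traversal.

```python
from collections import deque

def all_shortest_paths(graph, start, target):
    """Return every minimum-hop path from start to target in a directed graph."""
    dist = {start: 0}       # hop count at which each vertex is first reached
    parents = {start: []}   # predecessors that reach a vertex at that hop count
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short way to reach `nxt`.
                parents[nxt].append(node)
    if target not in dist:
        return []

    def build(node):
        # Walk the predecessor lists back to the start vertex.
        if node == start:
            return [[start]]
        return [path + [node] for parent in parents[node] for path in build(parent)]

    return build(target)

# Hypothetical toy graph: two 3-hop routes from A to E and one longer detour via F and G.
toy_graph = {"A": ["B", "C", "F"], "B": ["D"], "C": ["D"], "F": ["G"], "G": ["D"], "D": ["E"]}
print(all_shortest_paths(toy_graph, "A", "E"))
```

Only the two 3-hop routes are returned; the detour through `F` and `G` is longer and is therefore dropped.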

## AT LEAST

You can now combine one of the supported comparison operators with the special `AT LEAST (<expression>)` operator to require that a minimum number of array elements satisfy the condition. The number can be static or calculated dynamically using an expression:

In [None]:
results = aql.execute(
    """
    LET a = [ 1, 2, 3 ]  AT LEAST (2) IN  [ 2, 3, 4 ]  // true
    LET b = ["foo", "bar"]  AT LEAST (1+1) ==  "foo"   // false
    RETURN [a, b]
"""
)

[res for res in results]
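
If the semantics feel unfamiliar, the same two checks can be expressed in plain Python. The `at_least` helper below is a hypothetical name used only for illustration: it counts the elements that satisfy the comparison and checks the count against the required minimum.

```python
def at_least(values, minimum, predicate):
    """True if at least `minimum` elements of `values` satisfy `predicate`."""
    return sum(1 for v in values if predicate(v)) >= minimum

# [1, 2, 3] AT LEAST (2) IN [2, 3, 4]
a = at_least([1, 2, 3], 2, lambda v: v in [2, 3, 4])
# ["foo", "bar"] AT LEAST (1+1) == "foo"
b = at_least(["foo", "bar"], 1 + 1, lambda v: v == "foo")
print([a, b])  # [True, False]
```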

## Question Mark Array Operator

You can use the [? ... ] operator on arrays to check whether the elements fulfill certain criteria, and you can specify how often they should be satisfied. The operator is similar to an inline filter but with an additional length check and it evaluates to true or false.

In [None]:
results = aql.execute(
    """
    LET arr = [0,1,2]
    LET a = LENGTH(arr[*]) > 0
    LET b = arr[?]
    RETURN [a,b]
    """
)
[res for res in results]

In [None]:
results = aql.execute(
    """
    LET arr = [
      {
        "name": "Chris",
        "age": 32
      },
      {
        "name": "Jon",
        "age": 30
      }
    ]
    LET a = LENGTH(arr[* FILTER CURRENT.age > 30]) > 0
    LET b = arr[? 0..2 FILTER CURRENT.age > 30]
    RETURN [a,b]
    """
)

[res for res in results]
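
The "inline filter plus length check" reading can be sketched in plain Python. The `question_mark` helper is a hypothetical name, and the sketch assumes that a range such as `0..2` is inclusive on both ends; treat it as an illustration of the idea rather than a specification of the operator.

```python
def question_mark(values, predicate, minimum, maximum):
    """True if the number of matching elements lies within [minimum, maximum]."""
    matches = sum(1 for v in values if predicate(v))
    return minimum <= matches <= maximum

people = [{"name": "Chris", "age": 32}, {"name": "Jon", "age": 30}]

# LENGTH(arr[* FILTER CURRENT.age > 30]) > 0
a = len([p for p in people if p["age"] > 30]) > 0
# arr[? 0..2 FILTER CURRENT.age > 30]
b = question_mark(people, lambda p: p["age"] > 30, 0, 2)
print([a, b])  # [True, True]
```

Note that under this reading the two expressions are not interchangeable: with a range starting at 0, the second check would still be true when no element matches, while the first would be false.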

## Parallelism for Sharded Graphs (Enterprise Edition)

The 3.10 release supports traversal parallelism for sharded graphs, which means that traversals with many start vertices can now run in parallel. Processing the traversals in parallel threads yields an almost linear performance improvement.

This feature supports all types of graphs: General Graphs, SmartGraphs, and EnterpriseGraphs (including Disjoint).

A traversal always starts with one single start vertex and then explores the vertex neighborhood. When you want to explore the neighborhoods of multiple vertices, you now have the option to run multiple operations in parallel.

A few items to note:
* The following query sets `parallelism` to 3, meaning we can execute up to 3 traversals in parallel
* Because our `RETURN` statement only accesses `name`, we also benefit from [projections](https://www.arangodb.com/docs/stable/release-notes-new-features310.html#traversal-projections-enterprise-edition)

In [None]:
create_example_graph("knows_graph")

# With a parallelism of 3 we can run up to 3 traversals in parallel
# Our start vertices are the documents in the persons collection
results = aql.execute(
    """
    FOR startVertex IN persons
    FOR v,e,p IN 1..1 OUTBOUND startVertex GRAPH "knows_graph" OPTIONS {parallelism: 3}
    RETURN [p.vertices[0].name, 'Knows', p.vertices[1].name]
"""
)
[res for res in results]
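
As a rough client-side analogy, the effect of `parallelism: 3` resembles running one depth-1 traversal per start vertex on a pool of three workers. The sketch below uses a hypothetical in-memory `knows` dictionary rather than the real example data, and it says nothing about how the server schedules the work internally.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the "knows" edges; not the real example dataset.
knows = {"alice": ["bob", "carol"], "bob": ["dave"], "carol": [], "dave": ["alice"]}

def one_hop(start):
    """Depth-1 outbound traversal from a single start vertex."""
    return [[start, "Knows", neighbor] for neighbor in knows.get(start, [])]

# max_workers=3 plays the role of `parallelism: 3`: at most three
# start vertices are traversed at the same time.
with ThreadPoolExecutor(max_workers=3) as pool:
    per_start = list(pool.map(one_hop, knows))  # results keep the input order

rows = [row for result in per_start for row in result]
print(rows)
```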

AQL functions added to the 3.10 Enterprise Edition:

[OFFSET_INFO()](https://www.arangodb.com/docs/stable/aql/functions-arangosearch.html#offset_info): An ArangoSearch function to get the start offsets and lengths of matches for search highlighting.

[MINHASH()](https://www.arangodb.com/docs/stable/aql/functions-miscellaneous.html#minhash): A new function for locality-sensitive hashing to approximate the Jaccard similarity.

[MINHASH_COUNT()](https://www.arangodb.com/docs/stable/aql/functions-miscellaneous.html#minhash_count): A helper function to calculate the number of hashes (MinHash signature size) needed to not exceed the specified error amount.

[MINHASH_ERROR()](https://www.arangodb.com/docs/stable/aql/functions-miscellaneous.html#minhash_error): A helper function to calculate the error amount based on the number of hashes (MinHash signature size).

[MINHASH_MATCH()](https://www.arangodb.com/docs/stable/aql/functions-arangosearch.html#minhash_match): A new ArangoSearch function to match documents with an approximate Jaccard similarity of at least the specified threshold that are indexed by a View.
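
To show what "approximating the Jaccard similarity" means, here is a minimal textbook MinHash sketch in plain Python. The SHA-1 based hash family, the function names, and the sample sentences are assumptions made for illustration; the hashes ArangoDB uses internally may differ.

```python
import hashlib

def minhash_signature(tokens, num_hashes):
    """Keep the smallest hash value per seed; `tokens` must be non-empty."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in tokens
        ))
    return signature

def estimate_jaccard(sig_a, sig_b):
    # The fraction of positions where both signatures agree estimates the Jaccard similarity.
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)

def jaccard(a, b):
    # Exact Jaccard similarity: size of the intersection over size of the union.
    return len(a & b) / len(a | b)

a = set("the quick brown fox jumps over the lazy dog".split())
b = set("the quick brown fox leaps over a sleepy dog".split())
sig_a = minhash_signature(a, 256)
sig_b = minhash_signature(b, 256)
print(jaccard(a, b), estimate_jaccard(sig_a, sig_b))
```

A larger signature gives a tighter estimate at the cost of more hashing, which is the trade-off the `MINHASH_COUNT()` and `MINHASH_ERROR()` helpers are meant to quantify.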

AQL functions added to all editions of 3.10:

[SUBSTRING_BYTES()](https://www.arangodb.com/docs/stable/aql/functions-string.html#substring_bytes): A function to get a string subset using a start and length in bytes instead of in number of characters.

[VALUE()](https://www.arangodb.com/docs/stable/aql/functions-document.html#value): A new document function to dynamically get an attribute value of an object, using an array to specify the path.

[KEEP_RECURSIVE()](https://www.arangodb.com/docs/stable/aql/functions-document.html#keep_recursive): A document function to recursively keep attributes from objects/documents, as a counterpart to UNSET_RECURSIVE().
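
The idea of addressing a nested attribute with an array path, as `VALUE()` does, can be sketched in plain Python. The `value` helper below is hypothetical, and returning `None` for a path that does not exist is an assumption of this sketch rather than documented behavior.

```python
def value(doc, path):
    """Follow `path` (a list of attribute names and array indexes) into `doc`."""
    current = doc
    for key in path:
        if isinstance(current, dict) and isinstance(key, str) and key in current:
            current = current[key]
        elif isinstance(current, list) and isinstance(key, int) and 0 <= key < len(current):
            current = current[key]
        else:
            # Assumption: a missing step yields None instead of raising an error.
            return None
    return current

doc = {"foo": {"bar": "baz"}, "items": [{"name": "first"}, {"name": "second"}]}
print(value(doc, ["foo", "bar"]))         # baz
print(value(doc, ["items", 1, "name"]))   # second
print(value(doc, ["foo", "missing"]))     # None
```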

AQL functions changed in 3.10:

[MERGE_RECURSIVE()](https://www.arangodb.com/docs/stable/aql/functions-document.html#merge_recursive): You can now call the function with a single argument instead of at least two. It also accepts an array of objects now, matching the behavior of the MERGE() function.

[EXISTS()](https://www.arangodb.com/docs/stable/aql/functions-arangosearch.html#testing-for-nested-fields): The function supports a new signature EXISTS(doc.attr, "nested") to check whether the specified attribute is indexed as a nested field by a View or inverted index (introduced in v3.10.1).