Primary Sort ArangoSearch is not flexible, What's other way do with sort in ArangoSearch #20948

Closed
son2408 opened this issue May 18, 2024 · 18 comments

@son2408

son2408 commented May 18, 2024

My Environment

  • ArangoDB Version: 3.11.8
  • Deployment Mode: Single Server
  • Deployment Strategy: Manual Start
  • Infrastructure: own
  • Operating System: Ubuntu 20.04
  • Total RAM in your machine: 32 GB
  • Disks in use: SSD

Component, Query & Data

ViewConfig:

{
  "writebufferSizeMax": 33554432,
  "id": "2455388",
  "storedValues": [],
  "name": "view",
  "type": "arangosearch",
  "consolidationPolicy": {
    "type": "tier",
    "segmentsBytesFloor": 2097152,
    "segmentsBytesMax": 5368709120,
    "segmentsMax": 10,
    "segmentsMin": 1,
    "minScore": 0
  },
  "writebufferActive": 0,
  "links": {
    "tinhhinhdangky": {
      "analyzers": [
        "identity"
      ],
      "fields": {},
      "includeAllFields": false,
      "storeValues": "none",
      "trackListPositions": false
    },
    "giaychungnhan": {
      "analyzers": [
        "identity"
      ],
      "fields": {},
      "includeAllFields": false,
      "storeValues": "none",
      "trackListPositions": false
    },
    "dangkyquyen": {
      "analyzers": [
        "identity"
      ],
      "fields": {},
      "includeAllFields": false,
      "storeValues": "none",
      "trackListPositions": false
    },
    "thuadat": {
      "analyzers": [
        "identity"
      ],
      "fields": {},
      "includeAllFields": false,
      "storeValues": "none",
      "trackListPositions": false
    },
    "hosotiepnhan": {
      "analyzers": [
        "identity"
      ],
      "fields": {},
      "includeAllFields": false,
      "storeValues": "none",
      "trackListPositions": false
    }
  },
  "commitIntervalMsec": 1000,
  "consolidationIntervalMsec": 1000,
  "globallyUniqueId": "h189823C8B262/2455388",
  "cleanupIntervalStep": 2,
  "primarySort": [
    {
      "field": "tinhHinhDangKyId",
      "asc": true
    }
  ],
  "primarySortCompression": "lz4",
  "writebufferIdle": 64
}

Problem:
[screenshot attached]
My view links many collections. When I add a new collection to this view, I cannot add a new sort field to the existing view configuration. If I drop the view and create a new one, it takes a long time and we cannot do anything while the view is being built.
Expected result:
I want to add a new sort field without creating a new view. Otherwise, what other way is there to handle sorting in ArangoSearch?

@son2408 son2408 changed the title Solution replace when cannot create or remove field sort with primary sort ArangoSearch. Primary Sort ArangoSearch is not flexible, What's other way do with sort in ArangoSearch May 18, 2024
@alexbakharew
Contributor

Hello @son2408 !

Unfortunately, you can specify Primary Sort fields/order only during view creation:

You can only set the primarySort option and the related primarySortCompression and primarySortCache options on View creation.

Here is the documentation page: https://docs.arangodb.com/3.11/index-and-search/arangosearch/performance/#primary-sort-order

To solve your problem, you can switch to the recently implemented Inverted Index and Search Alias View: https://docs.arangodb.com/3.11/index-and-search/arangosearch/search-alias-views-reference/#how-to-use-search-alias-views.

With this new functionality, you can achieve your desired scenario:

  1. Create an Inverted Index for the new collection with the required Primary Sort settings.
  2. Add this index to the Search Alias View. Previously added indexes are not affected.
  3. After adding the index, you can query the view.
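
The steps above can be sketched as two request bodies. This is a minimal sketch, not a definitive implementation: it assumes the HTTP endpoints `POST /_api/index?collection=<collection>` and `PATCH /_api/view/<view>/properties`, and the index name `idx_b_sorted` and field `createdAt` are hypothetical names chosen for illustration. The helper functions are not part of any ArangoDB driver; they only build plain dictionaries.

```python
import json


def inverted_index_body(name, fields, sort_field, direction="asc"):
    """Build a body for POST /_api/index?collection=<collection> (assumed endpoint)."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return {
        "type": "inverted",
        "name": name,
        "fields": list(fields),
        # The primary sort is fixed at index creation time.
        "primarySort": {"fields": [{"field": sort_field, "direction": direction}]},
    }


def add_index_body(collection, index_name):
    """Build a body for PATCH /_api/view/<view>/properties on a search-alias View."""
    # "operation" is omitted because, per this thread, it defaults to "add".
    return {"indexes": [{"collection": collection, "index": index_name}]}


# Hypothetical names, for illustration only.
index_body = inverted_index_body("idx_b_sorted", ["createdAt"], "createdAt", "desc")
view_body = add_index_body("b", "idx_b_sorted")
print(json.dumps(index_body, indent=2))
print(json.dumps(view_body, indent=2))
```

Only the new index is listed in the view update, which is what keeps the previously added indexes untouched.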

If you have any further questions feel free to ask them as well!

@son2408
Author

son2408 commented May 28, 2024

@alexbakharew thank you

@dothebart dothebart added the 2 Fixed Resolution label May 28, 2024
@son2408 son2408 reopened this May 28, 2024
@son2408
Author

son2408 commented May 28, 2024

Hello @alexbakharew, I can't add a new index of a collection to an existing search-alias View. If I update the indexes property, it syncs everything, which I don't expect.

@son2408
Author

son2408 commented May 28, 2024

@alexbakharew how do I do it?

@son2408
Author

son2408 commented May 28, 2024

I have a search-alias View with the index "idx_1800260331957649408" on collection "a".
[screenshot attached]

@son2408
Author

son2408 commented May 28, 2024

Now I want to add a new index of collection "b" to the existing search-alias View "r", without affecting the indexes added before :(

@son2408
Author

son2408 commented May 28, 2024

@alexbakharew following your suggestion, should I call the API to update the search-alias View with this body?

"indexes": [
  {
    "collection": "a",
    "index": "idx_1800260331957649408",
    "operation": "string"
  },
  {
    "collection": "b",
    "index": "idx_1800288305630150656",
    "operation": "string"
  }
]

@alexbakharew
Contributor

Hello @son2408

If I update the indexes property, it syncs everything, which I don't expect.

Do you mean that the update of the search-alias view with exactly one new index has the same execution time as creating arangosearch view for all collections?

@alexbakharew
Contributor

"indexes": [
  {
    "collection": "a",
    "index": "idx_1800260331957649408",
    "operation": "string"
  },
  {
    "collection": "b",
    "index": "idx_1800288305630150656",
    "operation": "string"
  }
]

"operation": "string" is incorrect here. See https://docs.arangodb.com/3.11/index-and-search/arangosearch/search-alias-views-reference/#view-modification

The value must be either "add" or "del". The default is "add", so in your case you can simply omit it.
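
As a rough illustration of the correction above, the following sketch builds the same `indexes` entries with the invalid `"operation": "string"` removed. The helper `view_index_entry` is hypothetical (it is not an ArangoDB or driver function); it only encodes the rule stated in this comment, that `operation` is either `"add"` or `"del"` and may be omitted.

```python
VALID_OPERATIONS = {"add", "del"}


def view_index_entry(collection, index, operation=None):
    """Build one entry of the "indexes" array for a search-alias View update."""
    entry = {"collection": collection, "index": index}
    if operation is not None:
        if operation not in VALID_OPERATIONS:
            raise ValueError(f'operation must be "add" or "del", got {operation!r}')
        entry["operation"] = operation
    # With no operation given, the key is left out and the default applies.
    return entry


body = {
    "indexes": [
        view_index_entry("a", "idx_1800260331957649408"),
        view_index_entry("b", "idx_1800288305630150656"),
    ]
}
print(body)
```

Passing `"string"` as the operation raises a `ValueError` in this sketch, which mirrors why the original body was rejected as incorrect.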

@son2408
Author

son2408 commented May 28, 2024

I received an error message. Could you explain it to me? Thanks.
[screenshot attached]

@alexbakharew
Contributor

@son2408 I suspect that these two indexes have different primary sort orders. The order should be the same in all indexes.

If you mix directions in the primary sort order, the inverted index cannot be utilized for fully optimizing out a matching SORT operation if you use the inverted index standalone.
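
One way to check the suspicion above before updating the View is to compare the primary sort of each index definition. This is only a sketch: it assumes each definition is a dictionary with a `name` and a `primarySort` object holding a `fields` list of `field`/`direction` pairs, and the index names and the field `createdAt` are hypothetical.

```python
def primary_sort_signature(index_definition):
    """Return the primary sort as a tuple of (field, direction) pairs."""
    fields = index_definition.get("primarySort", {}).get("fields", [])
    return tuple((f["field"], f["direction"]) for f in fields)


def mismatched_primary_sorts(index_definitions):
    """Return names of indexes whose primary sort differs from the first one."""
    if not index_definitions:
        return []
    reference = primary_sort_signature(index_definitions[0])
    return [
        d["name"]
        for d in index_definitions[1:]
        if primary_sort_signature(d) != reference
    ]


# Hypothetical definitions, for illustration only.
idx_a = {"name": "idx_a", "primarySort": {"fields": [{"field": "createdAt", "direction": "asc"}]}}
idx_b = {"name": "idx_b", "primarySort": {"fields": [{"field": "createdAt", "direction": "desc"}]}}
print(mismatched_primary_sorts([idx_a, idx_b]))
```

An empty result means all listed indexes share the same primary sort fields and directions.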

@son2408
Author

son2408 commented May 28, 2024

@alexbakharew I want to create both an asc and a desc sort order for a field in the Inverted Index, and that causes the error above. Being able to choose only one sort order does not seem reasonable to me.

@son2408
Author

son2408 commented May 28, 2024

@alexbakharew I tried to create both an asc and a desc sort order on a field, but I see that it only works with asc :(
[three screenshots attached]

@alexbakharew alexbakharew self-assigned this May 29, 2024
@alexbakharew
Contributor

Hi @son2408!

I want to create both an asc and a desc sort order for a field in the Inverted Index, and that causes the error above. Being able to choose only one sort order does not seem reasonable to me.

Unfortunately, I wasn't able to reproduce this error by specifying different sort orders for the same field. I only got this error when I tried to create a view with indexes that have different sort orders.

Could you please provide the collection, index, and view definitions so that I can look into this in more detail?

I tried to create both an asc and a desc sort order on a field, but I see that it only works with asc :(

Under the hood, there is no data duplication in the index. We store the documents in the index only once and apply the sort order that you defined in the index definition. As a result, the documents in the index are sorted in either ascending or descending order, not both.

@alexbakharew alexbakharew removed the 2 Fixed Resolution label May 29, 2024
@son2408
Author

son2408 commented May 31, 2024

Hello @alexbakharew. If a field can be sorted either asc or desc, should I still set up the index sort for this field as you suggested?

@son2408
Author

son2408 commented May 31, 2024

@alexbakharew In practice, when clients use my application, they want to view either the oldest or the latest data by time. I can't force them to use only one sort direction.

@son2408
Author

son2408 commented May 31, 2024

@alexbakharew How can I index a field for sorting so that I can query it in either ascending or descending order? If I create two indexes per field, one ascending and one descending, then sorting on 10 fields requires 20 indexes. I think this is a bad idea.

@alexbakharew
Contributor

Hi @son2408!

Unfortunately, a performance advantage is only possible for the sort order that is specified in the primary sort definition.
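
The rule in this reply can be modeled with a small check. This is a simplified sketch and not ArangoDB's actual optimizer logic: it treats a requested sort as eligible only when it matches the primary sort definition field by field and direction by direction. Allowing a leading prefix of the definition to match is an assumption made here. The field name is taken from the view configuration at the top of this issue.

```python
def sort_matches_primary_sort(primary_sort, requested_sort):
    """Return True if requested_sort is a non-empty leading prefix of primary_sort.

    Both arguments are lists of (field, direction) tuples.
    """
    if not requested_sort or len(requested_sort) > len(primary_sort):
        return False
    return all(p == r for p, r in zip(primary_sort, requested_sort))


primary = [("tinhHinhDangKyId", "asc")]
print(sort_matches_primary_sort(primary, [("tinhHinhDangKyId", "asc")]))
print(sort_matches_primary_sort(primary, [("tinhHinhDangKyId", "desc")]))
```

In this model, a query that sorts in the opposite direction can still run, but it does not get the primary sort speedup.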

@son2408 son2408 closed this as completed Jun 4, 2024
@dothebart dothebart added the 2 Solved Resolution label Jun 4, 2024