Priority of search results when using searchActorTypeahed #1371

Open
myConsciousness opened this issue Jul 22, 2023 · 5 comments


@myConsciousness
Contributor

Hi,

When searching for @yui.syui.ai, we first type @yui. as shown below, but @yui.syui.ai is ranked 6th. It would be understandable if @yui.bsky.social, whose handle is very similar to @yui.syui.ai, were ranked first, but in this example @yui.bsky.social appears only 5th.

Screenshot 2023-07-22 23 14 18

Perhaps results matched by displayName are being preferred over those matched by handle.

export const getUserSearchQuerySimple = (
  db: Database,
  opts: {
    term: string
    limit: number
  },
) => {
  const { ref } = db.db.dynamic
  const { term, limit } = opts
  // Matching user accounts based on handle
  const accountsQb = getMatchingAccountsQb(db, { term })
    .orderBy('distance', 'asc')
    .limit(limit)
  // Matching profiles based on display name
  const profilesQb = getMatchingProfilesQb(db, { term })
    .orderBy('distance', 'asc')
    .limit(limit)
  // Combine and paginate result set
  return paginate(combineAccountsAndProfilesQb(db, accountsQb, profilesQb), {
    limit,
    direction: 'asc',
    keyset: new SearchKeyset(ref('distance'), ref('actor.did')),
  })
}

// Matching user accounts based on handle
const getMatchingAccountsQb = (
  db: Database,
  opts: { term: string; includeSoftDeleted?: boolean },
) => {
  const { ref } = db.db.dynamic
  const { term, includeSoftDeleted } = opts
  const distanceAccount = distance(term, ref('handle'))
  return db.db
    .selectFrom('actor')
    .if(!includeSoftDeleted, (qb) =>
      qb.where(notSoftDeletedClause(ref('actor'))),
    )
    .where('actor.handle', 'is not', null)
    .where(similar(term, ref('handle'))) // Coarse filter engaging trigram index
    .where(distanceAccount, '<', getMatchThreshold(term)) // Refines results from trigram index
    .select(['actor.did as did', distanceAccount.as('distance')])
}

// Matching profiles based on display name
const getMatchingProfilesQb = (
  db: Database,
  opts: { term: string; includeSoftDeleted?: boolean },
) => {
  const { ref } = db.db.dynamic
  const { term, includeSoftDeleted } = opts
  const distanceProfile = distance(term, ref('displayName'))
  return db.db
    .selectFrom('profile')
    .innerJoin('actor', 'actor.did', 'profile.creator')
    .if(!includeSoftDeleted, (qb) =>
      qb.where(notSoftDeletedClause(ref('actor'))),
    )
    .where('actor.handle', 'is not', null)
    .where(similar(term, ref('displayName'))) // Coarse filter engaging trigram index
    .where(distanceProfile, '<', getMatchThreshold(term)) // Refines results from trigram index
    .select(['profile.creator as did', distanceProfile.as('distance')])
}

// Combine profile and account result sets
const combineAccountsAndProfilesQb = (
  db: Database,
  accountsQb: AnyQb,
  profilesQb: AnyQb,
) => {
  // Combine user account and profile results, taking best matches from each
  const emptyQb = db.db
    .selectFrom('actor')
    .where(sql`1 = 0`)
    .select([sql.literal('').as('did'), sql<number>`0`.as('distance')])
  const resultsQb = db.db
    .selectFrom(
      emptyQb
        .unionAll(sql`${accountsQb}`) // The sql`` is adding parens
        .unionAll(sql`${profilesQb}`)
        .as('accounts_and_profiles'),
    )
    .selectAll()
    .distinctOn('did') // Per did, take whichever of account and profile distance is best
    .orderBy('did')
    .orderBy('distance')
  return db.db
    .selectFrom(resultsQb.as('results'))
    .innerJoin('actor', 'actor.did', 'results.did')
}
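
To make the suspected behavior concrete, here is a minimal in-memory sketch of what the combine step above appears to do: merge handle matches and display-name matches, keep the smaller distance per did, then order by distance. The `Match` type, the `combineMatches` helper, the example dids, and the distance values are all assumptions made up for illustration; they are not taken from the codebase or measured from the service.

```typescript
// Hypothetical in-memory model of the combine step; not the actual query code.
type Match = { did: string; distance: number }

// Models UNION ALL followed by DISTINCT ON (did) ORDER BY did, distance:
// per did, keep whichever of the handle or display-name distances is smaller.
function combineMatches(accounts: Match[], profiles: Match[]): Match[] {
  const best = new Map<string, number>()
  for (const { did, distance } of [...accounts, ...profiles]) {
    const prev = best.get(did)
    if (prev === undefined || distance < prev) best.set(did, distance)
  }
  // Models the outer keyset ordering: distance ascending, did as tie-breaker.
  return [...best.entries()]
    .map(([did, distance]) => ({ did, distance }))
    .sort(
      (a, b) =>
        a.distance - b.distance || (a.did < b.did ? -1 : a.did > b.did ? 1 : 0),
    )
}

// Illustrative distances only (assumed values).
const handleMatches: Match[] = [
  { did: 'did:example:yui-syui-ai', distance: 0.5 },
  { did: 'did:example:yui-bsky-social', distance: 0.45 },
]
const displayNameMatches: Match[] = [
  { did: 'did:example:someone-named-yui', distance: 0.1 },
]

const ranked = combineMatches(handleMatches, displayNameMatches)
console.log(ranked.map((m) => m.did))
```

Because both sources share one distance scale and nothing weights handles above display names, a short display name that closely matches the term can outrank every handle match under these assumptions, which would be consistent with the ordering reported above.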

@mozzius
Member

mozzius commented Aug 13, 2023

Another example:

image

image

@henoya

henoya commented Aug 20, 2023

It seems that commit b556912 on 2023-08-08 fixed the search similarity formula related to this issue, and the original poster's problem has been resolved, but I found another problematic example.

When I try to search for the "@mmm.bsky.social" account in a post or in the search input, typing "@mmm" brings up other accounts such as "@Mmmm" at the top. This may be unavoidable, but even when I continue typing to "@mmm.bsky", the "@mmm.bsky.social" account is ranked only 6th: it appears low in the candidate list in the desktop web version and does not appear at all in the mobile version, which shows only 5 candidates.

Bluesky mmm

If there is a clear separation between words, it would be better to prioritize similarity on a word-by-word basis.
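
As a rough sketch of what word-by-word prioritization could look like for handles, the following treats "." as the word separator and ranks candidates by how the term lines up with segment boundaries. The `segmentRank` and `rankHandles` helpers, the tier numbers, and the candidate handles are hypothetical and only illustrate the idea; this is not how the service ranks results.

```typescript
// Hypothetical ranking helper; the tiers are assumptions for illustration.
function segmentRank(term: string, handle: string): number {
  const t = term.replace(/^@/, '').toLowerCase()
  const h = handle.toLowerCase()
  if (h === t) return 0 // exact handle match
  // Term ends exactly on a segment boundary, e.g. "mmm.bsky" in "mmm.bsky.social"
  if (h.startsWith(t + '.')) return 1
  // Plain prefix that stops mid-segment, e.g. "mmm" in "mmmm.bsky.social"
  if (h.startsWith(t)) return 2
  // Term is a prefix of some later segment
  if (h.split('.').some((seg) => seg.startsWith(t))) return 3
  return 4
}

// Sort by tier, then by shorter handle, then alphabetically for stability.
function rankHandles(term: string, handles: string[]): string[] {
  return [...handles].sort(
    (a, b) =>
      segmentRank(term, a) - segmentRank(term, b) ||
      a.length - b.length ||
      (a < b ? -1 : a > b ? 1 : 0),
  )
}

// Made-up candidate handles.
const candidates = [
  'mmmm.bsky.social',
  'mmm.example.com',
  'mmm.bsky.social',
  'summm.bsky.social',
]

console.log(rankHandles('@mmm', candidates))
console.log(rankHandles('@mmm.bsky', candidates))
```

In practice a tier like this would probably be combined with the existing trigram distance as a secondary sort key rather than replacing it, since exact segment matching alone does not tolerate typos.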

@syui

syui commented Sep 30, 2023

User completion was working until a week ago.
However, it appears to have stopped working again.

@bnewbold
Collaborator

bnewbold commented Oct 2, 2023

@syui do you have a more specific example of something broken right now?

We did deploy some pretty large changes to profile ("actor") search recently; the corresponding post search changes haven't gone live yet. I think that most of the issues raised above are at least partially resolved, based on my spot checks right now.

We do know that a very small fraction of accounts are currently not indexed in the profile index, if they do not have a "profile" record in their repository (even an empty record). It is also possible that a handful (dozens?) of accounts have failed to re-index.

@syui

syui commented Oct 2, 2023

display name : ai
handle : yui.syui.ai

This may be a matter of prioritization.
