Extract substring after stop word #637

Closed
Parvares opened this issue May 1, 2023 · 9 comments

Comments

@Parvares

Parvares commented May 1, 2023

Hi Mike, may I ask you a question? Given this template URL in ContextSearch:

https://www.bibliotechediroma.it/opac/query/%s?context=tmatm

I need to extract the substring that follows one of these stop words (definite articles) when it appears at the beginning of the input string:
Il, Lo, La, I, Gli, Le, L'

Here are some examples:

La vita sulla terra e il futuro del genere umano > vita sulla terra e il futuro del genere umano
Il nome della rosa > nome della rosa
L'avaro > avaro
La globalizzazione e i suoi oppositori > globalizzazione e i suoi oppositori
I demoni > demoni

I found something similar here.

Thanks very much!

Parvares changed the title from "Extract substring from stop words" to "Extract substring from stop word" on May 1, 2023
@ssborbis
Owner

ssborbis commented May 2, 2023

If the examples you give are contained in the search string %s, you can use the Modify Search Terms field in the search engine edit modal to replace those leading strings.

/regex/replacement/[giym]

Something like this (untested):
/^(Il |Lo |La |I |Gli |Le |L')//g

Mind the spaces, and check https://regex101.com/ for reference.
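
As a rough illustration of what that rule does, here is a minimal sketch in plain JavaScript. It assumes the Modify Search Terms rule behaves like `String.prototype.replace` with the given pattern and flags, which this thread does not confirm. The helper name `stripLeadingArticle` is hypothetical and only used for the demonstration.

```javascript
// Pattern from the suggestion above: one leading article, anchored at the start.
// Note that the trailing space is part of every alternative except L'.
const LEADING_ARTICLE = /^(Il |Lo |La |I |Gli |Le |L')/g;

// Hypothetical helper: approximates the rule with String.prototype.replace.
function stripLeadingArticle(title) {
  return title.replace(LEADING_ARTICLE, "");
}

// Sample titles taken from the original question.
const examples = [
  "La vita sulla terra e il futuro del genere umano",
  "Il nome della rosa",
  "L'avaro",
  "La globalizzazione e i suoi oppositori",
  "I demoni",
];

for (const title of examples) {
  console.log(`${title} > ${stripLeadingArticle(title)}`);
}
```

Because the pattern is anchored with `^` and has no `m` flag, it can match at most once per string, so articles later in the title (such as "il" in the first example) are left alone.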

@Parvares
Author

Parvares commented May 3, 2023

> /^(Il |Lo |La |I |Gli |Le |L')//g

Thanks Mike, but it doesn't seem to be working well...

@ssborbis
Owner

ssborbis commented May 3, 2023

> /^(Il |Lo |La |I |Gli |Le |L')//g
> Thanks Mike, it doesn't seem working...

What is your search string? It seems to be working for me

@Parvares
Author

Parvares commented May 3, 2023

Strangely, it works from some websites (GitHub, for example) but not from all of them: it does not work from Google Search, for instance.

@ssborbis
Owner

ssborbis commented May 3, 2023

Could you clarify a little?

When you say it's not working "from" a website, are you saying that if you select text on those particular websites (Google, for instance) and search through this addon, using the engine with the template https://www.bibliotechediroma.it/opac/query/%s?context=tmatm, the search does not work or the search terms are not modified?

@Parvares
Author

Parvares commented May 3, 2023

Yes, I mean this.

@ssborbis
Owner

ssborbis commented May 3, 2023

Is the problem due to case sensitivity or a leading space?

@Parvares
Author

Parvares commented May 4, 2023

Correct, that was the problem. Now it works perfectly, thanks again, Mike!

/^(Il |Lo |La |I |Gli |Le |L')//gi
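
For anyone comparing the two rules, here is a small sketch of how the `i` flag changes the result, again assuming the rule behaves like `String.prototype.replace` (an assumption, not something confirmed in this thread). The `tolerant` pattern and the helper names are hypothetical additions for illustration: they show one way to also cope with a leading space, which the rule above does not handle.

```javascript
// Original suggestion: case-sensitive.
const caseSensitive = /^(Il |Lo |La |I |Gli |Le |L')/g;
// Rule that ended up working in this thread: same pattern plus the i flag.
const caseInsensitive = /^(Il |Lo |La |I |Gli |Le |L')/gi;
// Hypothetical variant (not from the thread): also skips leading whitespace.
const tolerant = /^\s*(Il |Lo |La |I |Gli |Le |L')/i;

// Hypothetical helpers wrapping each pattern.
const stripCaseSensitive = (s) => s.replace(caseSensitive, "");
const stripCaseInsensitive = (s) => s.replace(caseInsensitive, "");
const stripTolerant = (s) => s.replace(tolerant, "");

// Lowercase article: only the i-flag versions remove it.
console.log(stripCaseSensitive("il nome della rosa"));   // "il nome della rosa"
console.log(stripCaseInsensitive("il nome della rosa")); // "nome della rosa"

// Leading space: ^ no longer sits directly before the article.
console.log(stripCaseInsensitive(" Il nome della rosa")); // " Il nome della rosa"
console.log(stripTolerant(" Il nome della rosa"));        // "nome della rosa"
```

One trade-off to keep in mind: any of these patterns also strips a leading "I " or "Le " from titles where it is not an Italian article (an English title starting with "I ", for example).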

Parvares changed the title from "Extract substring from stop word" to "Extract substring after stop word" on May 4, 2023
@ssborbis
Owner

ssborbis commented May 4, 2023

No prob. Cheers!

ssborbis closed this as completed on May 4, 2023