
[Q]: Is there a program which can assist in searching for specific keywords in open databases of certain Courts? #303

Closed
Jacintha777 opened this issue Mar 4, 2022 · 46 comments
Labels: "can be worked on!" (ready to be worked on, even self-assigned!), "question" (further information is requested)

Comments

@Jacintha777

Jacintha777 commented Mar 4, 2022

Contact Details

j.r.k.asarfi@tilburguniversity.edu

Shoot!

I am looking into the case law of national courts and in this regard I have to search for specific keywords in two databases. These databases are: http://www.ttlawcourts.org/index.php/law-library/search-librarys-holdings and http://rechtspraak.sr/
In the first database, the specific keywords to search for are: referral, referral jurisdiction, Article 214 RTC, Caribbean Court of Justice, CCJ. The specific keywords for the second database are in Dutch, namely: verwijzingsprocedure, Herziene Verdrag van Chaguaramas, Caribisch Hof van Justitie, verwijzing naar het Caribisch Hof van Justitie (in English: referral procedure, Revised Treaty of Chaguaramas, Caribbean Court of Justice, referral to the Caribbean Court of Justice).

The expectation is that searching these databases for these specific keywords will result in cases which are relevant for my research.

Looking forward to your reply.

Code of Conduct

  • I agree to follow this project's Code of Conduct
Jacintha777 added the "question" (further information is requested) label on Mar 4, 2022
@hannesdatta
Contributor

Hi @Jacintha777, thanks a bunch. Could you please provide a bit more detail on how the search is going to be executed?

1) With regard to ttlawcourts:

At http://www.ttlawcourts.org/index.php/law-library/search-librarys-holdings, I do not see a "keyword" field.
[screenshot]

Further, does any filter need to be used on the Documents?
[screenshot]

Further, please specify how the search results should be saved:

  • Do you require a list in Excel with these search results?

[screenshot]

  • Or do you require the scraper to download the resulting PDF document?

[screenshot]

2) With regard to rechtspraak.sr

In formulating this issue, please imagine you are instructing a Research Assistant to strictly follow a particular procedure. Without any "thinking". Just executing a procedure. That way, we can instruct a program to do the same thing. Thanks!

@Jacintha777
Author

Jacintha777 commented Mar 4, 2022

Dear @hannesdatta, with regard to your first point: click on Supreme Court (second on the left under the logo of the Judiciary of Trinidad & Tobago) and then on High Court; there you see the "search the site" option, where you can search for the keywords. I just did this with the word 'referral', and you then find cases in which 'referral' is highlighted. The same procedure can be followed for the "Court of Appeal".

@Jacintha777
Author

With regard to the filter on the documents: 'judgments' is preferred

@Jacintha777
Author

Concerning how the search results should be saved: an Excel file containing the name of the case and a sentence before and after the keyword, to determine whether the keyword is used in the context of, e.g., 'referral to the CCJ'.
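For reference, a minimal sketch of what such a keyword-in-context export could look like in Python (the naive sentence splitter, the column names, and the assumption that each case's text is already available as a string are illustrative choices, not part of the request):

```python
import re
import pandas as pd  # .to_excel() also requires openpyxl

def keyword_in_context(case_name, text, keyword, window=1):
    """Return the keyword's sentence plus `window` sentences before and after it."""
    sentences = re.split(r"(?<=[.!?])\s+", text)  # naive sentence splitter
    rows = []
    for i, sentence in enumerate(sentences):
        if keyword.lower() in sentence.lower():
            context = " ".join(sentences[max(0, i - window): i + window + 1])
            rows.append({"case": case_name, "keyword": keyword, "context": context})
    return rows

# Hypothetical input: a mapping from case name to its full text.
cases = {"Example case A": "The parties met. The court considered a referral to the CCJ. Costs were awarded."}
rows = [row for name, text in cases.items()
        for row in keyword_in_context(name, text, "referral")]
pd.DataFrame(rows).to_excel("search_results.xlsx", index=False)
```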

@Jacintha777
Author

I would also appreciate it if the scraper could download the PDF document, if that is possible of course.

@Jacintha777
Author

With regard to rechtspraak.sr: the search should be conducted at https://rechtspraak.sr/uitspraken-databank/eenvoudig-zoeken/
The link that you are referring to is a more elaborate search which requires specific information, such as the case number, which makes the search complicated as I do not have that information.

@Jacintha777
Author

With regard to the format: a "text" view like this one, https://rechtspraak.sr/sru-hvj-2020-6/, is fine and helpful for determining the context in which the keyword is used. But if it is possible to highlight the keyword in the document, that would be great (if this is possible, of course).

@Jacintha777
Author

@hannesdatta, please let me know if you require further information. Thank you

@hannesdatta
Contributor

@BilgeKasapoglu, is this something you could handle? I'd say develop it for the first site and we can then check how it performs. Please invest about 2-3 hours for now. Maybe set up a meeting with Jacintha to clarify any issues.

I would try Beautiful Soup first, by the way; Selenium may be overkill. Check the tutorials at odcm.hannesdatta.com for code snippets.
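For instance, a first Beautiful Soup sketch against the site's built-in search could look like this (the URL parameters mirror the advanced-search links on the site; the CSS selector is a placeholder that still needs to be confirmed by inspecting the result page):

```python
import requests
from bs4 import BeautifulSoup

url = "https://www.ttlawcourts.org/index.php/advance-search"
params = {"searchword": "referral", "searchphrase": "all", "limit": 20}

response = requests.get(url, params=params, timeout=30)
soup = BeautifulSoup(response.text, "html.parser")

# Placeholder selector: adjust to the actual markup of the search results.
for link in soup.select("dt.result-title a"):
    print(link.get_text(strip=True), link.get("href"))
```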

@BilgeKasapoglu

Dear @hannesdatta,

I keep getting an "SSLCertVerificationError" when I try to request the URL. Do you have any experience with such an error? Thank you

Best
Bilge

@hannesdatta
Contributor

@BilgeKasapoglu, did you try to google this error? This search result seems to be relevant. Let me know please.

https://stackoverflow.com/questions/10667960/python-requests-throwing-sslerror
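For example, one blunt workaround discussed in threads like that one is to skip certificate verification for this single host; a safer option is to point `verify` at a proper CA bundle, so treat the snippet below as a sketch only:

```python
import requests
import urllib3

# Skip certificate verification for this host only and silence the warning that
# requests emits when doing so. Prefer verify="/path/to/ca-bundle.pem" if available.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

response = requests.get("https://www.ttlawcourts.org/", verify=False, timeout=30)
print(response.status_code)
```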

@Jacintha777
Author

Jacintha777 commented Mar 14, 2022

@hannesdatta and @BilgeKasapoglu, thanks - I am following your updates with interest. I am available to meet on 15 and 16 March, so let me know.

hannesdatta added the "can be worked on!" (ready to be worked on, even self-assigned!) label on Mar 14, 2022
@hannesdatta
Contributor

@BilgeKasapoglu, let us know whether any input is required for working on this.

@hannesdatta
Contributor

@BilgeKasapoglu, please also inform Jacintha about the expected delivery date (plus allow some time for me to review the final product).

@BilgeKasapoglu

Dear @hannesdatta and @Jacintha777

I think I can work on this on Thursday, if that is okay with you. I can send it to you by Friday noon, @hannesdatta. Thank you

Best
Bilge

@Jacintha777
Author

Dear @BilgeKasapoglu, that sounds great. I look forward to the results after @hannesdatta has reviewed the final product.
Kind regards,
Jacintha

@Jacintha777
Author

Jacintha777 commented Mar 15, 2022

@hannesdatta and @BilgeKasapoglu, my apologies, I closed this issue by mistake. What I also wanted to mention: this is for the website of Trinidad and Tobago, and I am really pleased to hear from both of you that it can be worked on. I hope you are also successful with the website of Suriname (rechtspraak.sr), which is quite a challenge. Thanks. Kind regards,
Jacintha

@BilgeKasapoglu

Dear @Jacintha777,

Could you guide me on how to search for the keywords on the second website? Is it through "Zoeken"? Thank you

Best
Bilge

@BilgeKasapoglu

Dear @hannesdatta,

I scraped the first website. Usually, there are fewer than 20 results. However, "Caribbean Court of Justice" gives 50 results, and the scraper only gets the first 20. I have come across this problem before, where the results are spread across separate pages, but I never understood how to solve it. Would you please help me?

Also, how should I share the code and files with you? For now, I will send them through Microsoft Teams. Thank you

Best
Bilge

@Jacintha777
Author

Jacintha777 commented Mar 16, 2022 via email

@hannesdatta
Contributor

hannesdatta commented Mar 17, 2022

@BilgeKasapoglu - I checked your Teams message. Thanks a bunch for your work!

Please keep the communication / results etc. on GitHub (all project-related communication needs to be here, not somewhere else).

The main goal here is that @Jacintha777 can run the notebook herself. At Tilburg Science Hub, we don't do the job for our colleagues; we give them the tools so they can do it themselves. Plus, we share those tools online.

Accordingly, please:

  • annotate the jupyter notebook with markdown cells (e.g., setup, scraper for this, scraper for that)
  • annotate the jupyter notebook with comments (e.g., this is what happens here, this is how I do this, this is where stuff gets saved)
  • test the notebook on colab.google.com, because that's probably the most ideal place for @Jacintha777 to run her work ultimately (saves a lot of setup costs)
  • use functions as much as possible, see https://tilburgsciencehub.com/write/good-code (a minimal sketch follows right after this list)
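As a rough illustration of that last point (the function name, its parameters, and the selector are hypothetical), the notebook could expose a single function that only needs a different keyword per run:

```python
import requests
from bs4 import BeautifulSoup

SEARCH_URL = "https://www.ttlawcourts.org/index.php/advance-search"

def run_query(keyword, limit=20):
    """Run one search and return the result titles; only `keyword` needs changing."""
    params = {"searchword": keyword, "searchphrase": "all", "limit": limit}
    page = requests.get(SEARCH_URL, params=params, timeout=30)
    soup = BeautifulSoup(page.text, "html.parser")
    # Placeholder selector: adjust to the site's actual result markup.
    return [a.get_text(strip=True) for a in soup.select("dt.result-title a")]

for kw in ["referral", "referral jurisdiction", "Caribbean Court of Justice", "CCJ"]:
    print(kw, "->", len(run_query(kw)), "results")
```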

Let us NOT ship any Excel files - Jacintha will have to edit the search queries herself.

Please post your updated notebook here for another round of feedback. Alternatively, post the notebook on gist.github.com.

Old version:
scrapeCourts.ipynb.zip

@Jacintha777
Author

Dear @hannesdatta, thanks for this update. Please don't forget that I have zero knowledge of how to use a scraper. Therefore, guidance from @BilgeKasapoglu might be necessary. Kind regards, Jacintha

@hannesdatta
Contributor

@Jacintha777, no worries. Google Colab has a point-and-click interface, and @BilgeKasapoglu can walk you through how to use it (it's really just a matter of opening it, changing the search query, and waiting for the results to be downloaded). If you get that to run, it's way more useful for you.

@Jacintha777
Author

@hannesdatta, thanks and looking forward to the session with @BilgeKasapoglu. Kind regards, Jacintha

@BilgeKasapoglu

scrapeCourts.ipynb.zip

@Jacintha777 and @hannesdatta,

Here is the most up-to-date version of the notebook. I also created a version on Google Colab and added both of you; I hope you can see it there now. I must say the second website is difficult to work with, because it does not search for the keywords as a whole: if I search for "van huizen", it gives all the results containing "van" and "huizen", even separately. Below, I am adding some additional information for @Jacintha777 on how to get the class names on each website. Thank you
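One possible mitigation is to post-filter the site's overly broad matches and keep only results that contain the exact phrase. A minimal sketch, assuming we already have each result's title and text as strings (the two example results below are made up):

```python
def contains_phrase(text, phrase):
    """True only if the whole phrase occurs, not just its individual words."""
    return phrase.lower() in text.lower()

# Made-up examples of what the rechtspraak.sr search might return for "van huizen".
results = [
    {"title": "example case A", "text": "... de verkoop van twee huizen ..."},
    {"title": "example case B", "text": "... de verkoop van huizen in Paramaribo ..."},
]

exact_matches = [r for r in results if contains_phrase(r["text"], "van huizen")]
print([r["title"] for r in exact_matches])  # only "example case B" survives
```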

Best,
Bilge

@BilgeKasapoglu

Dear @Jacintha777 and @hannesdatta,

Below you can find additional information on scraping a website. It is about how to get the class name of the objects that we want to scrape. Let me know if anything is unclear. Thank you

Best
Bilge
scrapingAdditionalInfo.pdf

@Jacintha777
Author

@BilgeKasapoglu and @hannesdatta, thank you very much.
Today I will not be able to try out the scraping tool because of various meetings, so I will try it out over the weekend. I will let you know on Monday how it went and whether I need a session with @BilgeKasapoglu to guide me via a Teams meeting.

@hannesdatta @BilgeKasapoglu, I expected the complications with the second website, because I tried searching rechtspraak.sr myself by entering the keywords separately in "eenvoudig zoeken". So I recognize what @BilgeKasapoglu found with the words 'van' and 'huizen'. I also did a similar search on the website of Trinidad & Tobago, and there some of the keywords delivered results. So I am really looking forward to the results of the scraping tool.

Thanks and will update both of you on Monday. Have a good weekend. Kind regards, Jacintha

@Jacintha777
Author

Dear @BilgeKasapoglu, I opened the link with Notepad and from there I had no idea how to proceed. I also have another question: how do I get access to Google Colab? It would be very helpful if you could guide me through it. In this regard, I would appreciate a session via Zoom or Teams. Tomorrow I will be at the university, and I am not sure whether you also work from there. I am available for a session via Zoom or Teams on Wednesday 23, Thursday 24, or Friday 25 March. Please let me know which date and time is convenient for you. Thank you, Jacintha

@BilgeKasapoglu

Dear @Jacintha777,

Tomorrow I have a meeting with my supervisors at 3 PM, and I am trying to give them an end result for that meeting. Would it be possible for you to hold the meeting after 16:30? I will be at the university the whole day. Thank you

Best
Bilge

@Jacintha777
Author

Dear @BilgeKasapoglu, see you tomorrow after 16.30. My office is in the M-building room M312. Good luck with your meeting!

@hannesdatta
Contributor

Dear @BilgeKasapoglu, please move the scraper code to a repository where we can actually collaborate on the files. See https://github.com/tilburgsciencehub/onboarding/wiki/Workflow. Any feedback required from me at this stage?

@BilgeKasapoglu

Dear @hannesdatta,

Here is the repo: https://github.com/tilburgsciencehub/courtScraping

Bilge

@hannesdatta
Contributor

hannesdatta commented Mar 25, 2022

@BilgeKasapoglu:

  • implement looping through the search results using iterators and limits (a minimal sketch follows below): https://www.ttlawcourts.org/index.php/advance-search?searchword=court%20of%20appeal&searchphrase=all&limit=50&start=55
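A minimal pagination sketch along those lines (the CSS selector is a placeholder, and stopping at the first empty page is an assumption):

```python
import requests
from bs4 import BeautifulSoup

BASE = "https://www.ttlawcourts.org/index.php/advance-search"

def search_all_pages(keyword, page_size=50, max_pages=40):
    """Walk through the paginated results using the limit/start URL parameters."""
    results = []
    for page in range(max_pages):
        params = {
            "searchword": keyword,
            "searchphrase": "all",
            "limit": page_size,
            "start": page * page_size,
        }
        response = requests.get(BASE, params=params, timeout=30)
        soup = BeautifulSoup(response.text, "html.parser")
        hits = soup.select("dt.result-title a")  # placeholder selector
        if not hits:                             # empty page: assume we are done
            break
        results.extend({"title": a.get_text(strip=True), "url": a.get("href")}
                       for a in hits)
    return results

print(len(search_all_pages("court of appeal")))
```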

@Jacintha777 - please specify a search on the Surinamese site that produces "valid" results.

E.g., we are trying with:
[screenshot]

But the search results are quite meaningless.

Can you use these results at all?

[screenshot]

The site just produces results with "van"... not really intended, right?

@Jacintha777 please advise how to go ahead here.

@BilgeKasapoglu

Dear @hannesdatta ,

@BilgeKasapoglu is my username. I guess you have been @'ing someone else.

Best
Bilge

tilburgsciencehub deleted a comment from Bilge on Mar 25, 2022
@hannesdatta
Contributor

noted ;)

@Jacintha777
Author

@hannesdatta and @BilgeKasapoglu, the results with 'van' are indeed useless. I think the rechtspraak.sr website will not deliver any results. I just tried 'herziene verdrag Chaguaramas' and I get results that are not relevant for my research, and no results for 'Chaguaramas'. So at least I can say that I tried scraping this website, but without results. Many thanks for trying.

@BilgeKasapoglu, with regard to Trinidad and Tobago: do you get the same results as for the High Court when searching the Court of Appeal?

@BilgeKasapoglu

Dear @hannesdatta and @Jacintha777,

I modified the scraper and delivered the outcomes to Jacintha last week. We concluded that the second website was not really workable. For the first website, I created Python code to scrape it. I shared the file with you in a repository and on Google Colab.

Best
Bilge

@Jacintha777
Author

Jacintha777 commented Apr 1, 2022 via email

@Jacintha777
Author

Dear @hannesdatta and @BilgeKasapoglu,

Thank you once again.

@hannesdatta, I know you are quite busy, but I would still like to enquire about the following: how can the Python code find one case and not another? When searching the website of Trinidad and Tobago, the keyword 'referral' delivered the Jhamilly Hadeed case of 2019. However, there is another case from 2021 which should also have been detected when searching for the keyword 'referral'. Do you have an idea or a possible explanation for why one case is detected and the other is not?

Thank you and looking forward to your reply.

Kind regards,
Jacintha

@hannesdatta
Contributor

@BilgeKasapoglu, can you comment/have an idea?

@BilgeKasapoglu

Dear @Jacintha777,

Would you share the exact names of the cases? Also, is the search under the High Court or the Court of Appeal?

Thank you
Bilge

BilgeKasapoglu reopened this on Apr 1, 2022
@BilgeKasapoglu

@Jacintha777
Also, which case should have shown up in 2021? Thank you

@Jacintha777
Author

Dear @BilgeKasapoglu and @hannesdatta ,

Thank you. If I understand correctly, if the word 'referral' is not mentioned in the title or summary of the case on the website, then the scraper will not be able to find it. If that is the case, then this answers my question. The conclusion I draw from this is that the scraper cannot help me detect cases which mention 'referral' only somewhere in the full text, rather than in the title or case summary. I had hoped it could, which would have made things easier for me. Can this be fixed so that it searches everywhere on the website and not only the title and summary?

Thank you once again.

Kind regards,
Jacintha

@BilgeKasapoglu

Dear @Jacintha777,

I do not know how to fix it; I think the scraper cannot help with this. Maybe @hannesdatta knows more about this, but it is probably a limitation of the website's design. Thank you

Best
Bilge

@hannesdatta
Contributor

Hi all,
@Jacintha777, a web scraper captures text that is visible on a website. As the full case text is not visible on the pages the scraper sees, we can't capture it. Our idea was to use the site's search function (right, @BilgeKasapoglu?) to retrieve the cases for you, but the search functionality seemed very limited.

My approach would be to use broader search words (which you could change in the Jupyter notebook), download all cases, and then use the PDF/text tool we developed for you earlier to search through this data.
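A rough sketch of that second step, searching already-downloaded case files for the keywords (it assumes the PDFs sit in a local folder and that pdfminer.six is installed; it is not the exact tool mentioned above):

```python
from pathlib import Path
from pdfminer.high_level import extract_text  # pip install pdfminer.six

KEYWORDS = ["referral", "referral jurisdiction", "Caribbean Court of Justice", "CCJ"]

def scan_pdfs(folder="downloaded_cases"):
    """Report which downloaded case PDFs mention which keywords."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        text = extract_text(str(pdf)).lower()
        found = [kw for kw in KEYWORDS if kw.lower() in text]
        if found:
            print(pdf.name, "->", ", ".join(found))

scan_pdfs()
```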

Note that this is a massive undertaking that we can't facilitate at this stage.

What I would suggest is that you try to develop the code further from here.

If you want to learn coding in Python, you can also enroll in https://odcm.hannesdatta.com (starting in September), where you will learn Python and scraping.
