GitHub - uaedevelopers/Image-scraper: data scraper imager scraper from youpoo

README: Web Civil Scraper Automation

This document provides instructions on how to set up and run the web scraping script to extract case data from the New York Courts website.

Initial Setup

Create the Folder: On your E: drive, create a folder with the following path: E:\GrapeTask\Web_Civil\Final

Place the Files: Ensure all four project files (cases_input_output.xlsm, chrome_debug.bat, scraper_final.py, and cases_input_output.xlsx) are placed inside this newly created folder.

Launch Chrome for Debugging: Right-click the chrome_debug.bat file and choose "Run as administrator" to open a special instance of the Chrome browser. The script connects to this browser, so it must be running before you start the scraper.

One-Time Captcha/Turnstile Solve: When the browser opens, go to the following URL: https://iapps.courts.state.ny.us/webcivilLocal/LCSearch?param=I. If you see a "Turnstile" or "Captcha" security check, solve it manually. This step is only required once. After solving it, do not close the browser.
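Before running the scraper, it can help to confirm that the setup steps above actually took effect. The following is a minimal sketch, not part of the project: the helper names `missing_files` and `debug_browser_info` are hypothetical, and the port 9222 is an assumption, since this README does not show which port chrome_debug.bat uses. It relies on the `/json/version` endpoint that Chrome exposes when started with `--remote-debugging-port`.

```python
import http.client
import json
import urllib.request
from pathlib import Path

# File names are taken from the "Place the Files" step.
REQUIRED_FILES = (
    "cases_input_output.xlsm",
    "chrome_debug.bat",
    "scraper_final.py",
    "cases_input_output.xlsx",
)


def missing_files(folder):
    """Return the required file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


def debug_browser_info(port=9222, host="127.0.0.1", timeout=2.0):
    """Return the browser string reported by Chrome's debugging endpoint.

    ASSUMPTION: chrome_debug.bat launches Chrome with
    --remote-debugging-port=9222. Pass a different port if it does not.
    Returns None when no debugging browser answers on that port.
    """
    url = f"http://{host}:{port}/json/version"
    # An empty ProxyHandler keeps this local request away from system proxies.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            info = json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        return None
    return info.get("Browser") if isinstance(info, dict) else None
```

For example, `missing_files(r"E:\GrapeTask\Web_Civil\Final")` should return an empty list once all four files are in place, and `debug_browser_info()` should return a non-empty string while the debugging browser is still open.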

Running the Scraper

Open the Excel File: Open the cases_input_output.xlsm file. This file contains the list of Index Numbers that the script will process.

Run the Script: On the Excel sheet, locate and click the "Run Scraper" button.

Wait for Confirmation: Once you click the button, a message box will appear with the text: "Python script is Running Successfully. Please Check cases_input_output.xlsx file for Data."

How the Script Works

Input Data: The script reads the list of Index Numbers from the IndexNumbers sheet in the cases_input_output.xlsm file.

Web Scraping: It connects to the web browser you launched earlier, navigates to the website, and automatically fills in the search forms for each Index Number.

Output: All the scraped data is saved to the cases_input_output.xlsx file.

"Not Found" Handling: If the script is unable to find data for a specific Index Number, it writes "Not Found" in the Case Status column and immediately moves on to the next one, so a single missing case does not stall the rest of the run.
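The "Not Found" behaviour can be sketched as a simple loop. This is an illustration only, not the code in scraper_final.py: `process_index_numbers` and the `fetch_case` callable are hypothetical stand-ins for the real browser lookup, the "Index Number" column name is an assumption, and treating a raised exception the same as a missing case is also an assumption. Only the "Case Status" column and the "Not Found" value come from the description above.

```python
def process_index_numbers(index_numbers, fetch_case):
    """Look up each index number and return one result row per input.

    `fetch_case` is a hypothetical callable standing in for the browser
    lookup. It returns a dict of case fields, or None when nothing is found.
    """
    rows = []
    for index_number in index_numbers:
        try:
            case = fetch_case(index_number)
        except Exception:
            # ASSUMPTION: a failed lookup is recorded like a missing case,
            # so one bad entry cannot stop the rest of the batch.
            case = None
        if not case:
            rows.append({"Index Number": index_number, "Case Status": "Not Found"})
            continue
        rows.append({"Index Number": index_number, **case})
    return rows


if __name__ == "__main__":
    # Made-up placeholder data, used only to show the control flow.
    known = {"EXAMPLE-1": {"Case Status": "Active"}}
    for row in process_index_numbers(["EXAMPLE-1", "EXAMPLE-2"], known.get):
        print(row)
```

Because every input produces exactly one row, the output stays aligned with the input list even when some lookups fail.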

