Web scraping is a technique for extracting data from websites, using dedicated software such as crawlers that either speak HTTP at a low level or embed a full web browser (per the Wikipedia definition).
Put more simply, it is a technique that fetches specific information from particular sites through code written for that task.
When you scrape a website, you usually want meaningful information, not trivial details such as today's date!
That information matters to you or your customers, and it needs to be current rather than stale. This is why fetching the data manually every day is not a practical solution.
You can display this information in your own project, or even expose it through an API so that it is available to other developers.
First you have to learn how to fetch the content of a site, and then how to filter that content so that only the information you need remains.
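As a minimal sketch of the fetching step, here is how you might download a page's raw HTML with Python's standard library (the URL and the browser-like user agent are illustrative assumptions, not part of the original text):

```python
import urllib.request

def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download the raw HTML of a page as a string.

    Some sites block the default Python user agent, so we send a
    browser-like one; adjust as needed for the site you target.
    """
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)

# Example usage (requires network access):
# html = fetch_page("https://example.com")
```

Once you have the HTML as a string, the filtering step described next operates on that string.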
Filtering the content can be done in several ways. Two of the most common are:
- String replacement: stripping out the unwanted parts of the content
- Pattern matching with regular expressions (RegEx)
The second method is clearly the better choice: with plain string replacement, if even a single letter or punctuation mark on the site changes, your extraction code breaks and you have to update it to match the change.