
A script that takes a file containing a list of URLs and scrapes each URL with goroutines.


8ugr4/scrape


// UPDATE 19.06.24 21:20: net/url is not enough to scrape tags.

// use net/http, and maybe an HTML parser, to parse the HTML response
// and extract the charset from it
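A minimal sketch of the charset-extraction step: the declared charset can usually be read from the `Content-Type` response header with the standard `mime` package. The helper name `charsetOf` is illustrative, not from this repo.

```go
package main

import (
	"fmt"
	"mime"
)

// charsetOf extracts the charset parameter from a Content-Type header
// value, falling back to "utf-8" when none is declared or parsing fails.
func charsetOf(contentType string) string {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "utf-8"
	}
	if cs, ok := params["charset"]; ok {
		return cs
	}
	return "utf-8"
}

func main() {
	fmt.Println(charsetOf("text/html; charset=ISO-8859-1")) // ISO-8859-1
	fmt.Println(charsetOf("text/html"))                     // utf-8
}
```

For charsets inside the document itself (a `<meta charset=...>` tag), an HTML parser would still be needed, as the note above suggests.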

// goal:
// input: a list of URLs in a text file (.txt), UTF-8
// output: one line per URL in the format "url":""

// ::: WORKING STRUCTURE OF THE PROGRAM :::
// read the file, save the URLs

// http scraping

// use goroutines to scrape every URL

// use channels (buffered with the total number of goroutines)

// while the READER goroutines are scraping, use one goroutine to carry the results to the output file

// use another goroutine to convert the input taken from the READER goroutines into the expected format

// format is: "url":""
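The formatting and writer steps might be sketched as follows; `formatLine` and the single writer goroutine are illustrative names under the structure described above, not the repo's code, and a `strings.Builder` stands in for the output file.

```go
package main

import (
	"fmt"
	"strings"
)

// formatLine renders one scraped page in the "url":"" format.
func formatLine(url, body string) string {
	return fmt.Sprintf("%q:%q", url, body)
}

func main() {
	lines := make(chan string, 3) // buffered with the goroutine count
	done := make(chan struct{})
	var out strings.Builder // stands in for the output file

	// single writer goroutine: carries formatted lines to the output
	go func() {
		for l := range lines {
			out.WriteString(l + "\n")
		}
		close(done)
	}()

	lines <- formatLine("https://example.com", "")
	close(lines)
	<-done
	fmt.Print(out.String()) // "https://example.com":""
}
```

`%q` quotes and escapes both fields, so a body containing quotes or newlines still produces one well-formed output line.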

// target address URL

// EXAMPLE OUTPUT AT THE MOMENT (21:40):
/*

scrapedInput:= https://drstearns.github.io/tutorials/gojson/

scrapedInput:= https://leangaurav.medium.com/common-mistakes-when-using-golangs-sync-waitgroup-88188556ca54

scrapedInput:= https://stackoverflow.com/questions/48271388/for-loop-with-buffered-channel

Process finished with the exit code 0

*/
