celine-huang/Intro-to-Computer-Crawler

Intro to Computer: Crawler

Introduction to Computer, National Taiwan University, Fall 2019, Final Project

Crawls news from the NTU CSIE website.

Introduction

This project is a web crawler designed to fetch and parse news from the NTU CSIE website. It was developed as the final project for the course Introduction to Computer at National Taiwan University. The project was a collaborative effort between Ying-Hua Lee and me.

For the commit history and pull requests from our collaboration, please refer to the original repository hosted on Ying-Hua Lee's GitHub (Armychais902): ItC-python-hw. This repository is a copy, created only to make the project easier to find on my GitHub profile; its code is unchanged from the original.

Environment

Oracle VirtualBox 6.0.12

Ubuntu 18.04

Python 3.6.9

Packages:

lxml==4.4.1
requests==2.22.0
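
As a rough illustration of how the two packages above could divide the work, here is a minimal sketch in which requests downloads a listing page and lxml extracts the links from it. It is not the project's actual code: the function names `parse_news_links` and `fetch_news_links` are hypothetical, and the catch-all XPath `//a[@href]` is an assumption, since the real site markup would likely need a more specific selector.

```python
from urllib.parse import urljoin

from lxml import html


def parse_news_links(page_source, base_url):
    """Return (title, absolute_url) pairs for the links in a listing page.

    Hypothetical helper: the XPath below matches every link on the page,
    which is an assumption made for illustration only.
    """
    root = html.fromstring(page_source)
    items = []
    for anchor in root.xpath("//a[@href]"):
        title = anchor.text_content().strip()
        if title:  # skip image-only or empty links
            items.append((title, urljoin(base_url, anchor.get("href"))))
    return items


def fetch_news_links(url):
    """Download a listing page and parse it (needs network access)."""
    import requests  # imported here so the parser can be used offline

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return parse_news_links(response.content, url)
```

Keeping the parsing step separate from the download makes it possible to try the parser on a saved HTML file before pointing it at the live site.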

Collaboration Contribution

Ying-Hua Lee (Armychais902): main.py, crawler.py (lines 1-54)

Shih-Ning Huang (celine-huang): args.py, crawler.py (lines 55-90)
