Add professor course information #833

Open
wants to merge 13 commits into
base: master
Changes from 12 commits
460 changes: 460 additions & 0 deletions rpi_data/modules/Eric Rutledge _ Faculty.html

Large diffs are not rendered by default.

1,026 changes: 1,026 additions & 0 deletions rpi_data/modules/Eric Rutledge _ Faculty_files/css2

Large diffs are not rendered by default.


659 changes: 659 additions & 0 deletions rpi_data/modules/Eric Rutledge _ Faculty_files/js

Large diffs are not rendered by default.

545 changes: 545 additions & 0 deletions rpi_data/modules/Kathleen Galloway _ Faculty.html

Large diffs are not rendered by default.

1,026 changes: 1,026 additions & 0 deletions rpi_data/modules/Kathleen Galloway _ Faculty_files/css2

Large diffs are not rendered by default.


659 changes: 659 additions & 0 deletions rpi_data/modules/Kathleen Galloway _ Faculty_files/js

Large diffs are not rendered by default.


35 changes: 35 additions & 0 deletions rpi_data/modules/professor.py
@@ -0,0 +1,35 @@
import json

import requests
from bs4 import BeautifulSoup

with open('Professors.json') as f:
    data = json.load(f)


def scrape_info(url):
    """Return the text of the 'Teaching' nav link on a profile page, or None."""
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.content, 'html.parser')
    # `string=` replaces the deprecated `text=` argument in BeautifulSoup 4.
    link = soup.find('a', class_='nav-link', string='Teaching')
    if link:
        # Return the value so the caller can store it (it was only printed before).
        return link.text.strip()
    print("not found")
    return None


for professor in data:
    profile = professor['Profile Page']
    # Fall back to a placeholder when there is no profile or nothing was found.
    classes = scrape_info(profile) if profile else None
    professor['Teaching'] = classes if classes else 'Not available'

# Save the updated data back to Professors.json (this overwrites the input file).
with open('Professors.json', 'w') as f:
    json.dump(data, f, indent=4)
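
The script reopens Professors.json for writing, so an interrupted run can leave the input file truncated. A minimal sketch of one alternative is shown here: write to a temporary file and rename it into place. The helper name `save_professors` and the output path are hypothetical and not part of this PR.

```python
import json
import os
import tempfile


def save_professors(data, out_path):
    """Write `data` as JSON to `out_path` via a temp file and an atomic rename.

    Hypothetical helper for illustration; not part of the PR.
    """
    directory = os.path.dirname(os.path.abspath(out_path))
    # Create the temp file in the target directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=4)
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, out_path)
    except BaseException:
        # Clean up the partial temp file and re-raise the original error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `save_professors(data, 'Professors_with_teaching.json')` (an assumed file name) would leave the original Professors.json untouched, which makes a failed scrape easier to rerun.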


46 changes: 46 additions & 0 deletions rpi_data/professor.py
@@ -0,0 +1,46 @@
import json
import re
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Matches strings shaped like "ABCD 1234 - Course Title".
COURSE_PATTERN = re.compile(r'[A-Z]+\s\d{4} - [A-Za-z\s]+')

with open('Professors.json') as f:
    data = json.load(f)


def scrape_info(url):
    """Return the classes listed on a professor's Teaching page, or None."""
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.content, 'html.parser')
    titles_list = soup.select('.doc-title.serif a')
    titles = [title.text.strip() for title in titles_list]
    print(titles)
    # `link` was never assigned in this version (NameError); this lookup is
    # assumed from rpi_data/modules/professor.py.
    link = soup.find('a', class_='nav-link', string='Teaching')
    if not link:
        return None
    teaching_url = link['href'] if link.has_attr('href') else None
    if not teaching_url:
        return None
    # Resolve relative hrefs against the profile URL before requesting them.
    response = requests.get(urljoin(url, teaching_url), timeout=10)
    soup = BeautifulSoup(response.content, 'html.parser')
    # Extract the classes. The exact method will depend on the structure of the webpage.
    classes_list = soup.find_all('li', class_='class-item')
    # Run the regex once per item and keep only the items that match.
    matches = (COURSE_PATTERN.search(item.text) for item in classes_list)
    return [match.group() for match in matches if match]


for professor in data:
    profile = professor['Profile Page']
    classes = scrape_info(profile) if profile else None
    professor['Teaching'] = classes if classes else 'Not available'

# Save the updated data back to Professors.json (this overwrites the input file).
with open('Professors.json', 'w') as f:
    json.dump(data, f, indent=4)
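
To check what the course regex in `scrape_info` accepts without hitting the network, the matching step can be exercised on plain strings. This is a minimal sketch: `extract_courses` is a hypothetical helper, the sample strings are invented for illustration, and the added `strip()` is not in the PR.

```python
import re

# Same pattern as in scrape_info: subject code, four digits, " - ", then a title.
COURSE_PATTERN = re.compile(r'[A-Z]+\s\d{4} - [A-Za-z\s]+')


def extract_courses(texts):
    """Return the first course-like match from each string, skipping non-matches."""
    courses = []
    for text in texts:
        match = COURSE_PATTERN.search(text)
        if match:
            # The title class includes whitespace, so trim what it captures at the end.
            courses.append(match.group().strip())
    return courses


if __name__ == "__main__":
    samples = [
        "CSCI 1200 - Data Structures (Spring)",  # invented sample text
        "Office hours: Monday",                  # no course code, so it is skipped
    ]
    print(extract_courses(samples))
```

Because the title part only allows letters and whitespace, a title that contains a digit, colon, or ampersand is cut off at that character, which may be worth handling if the real pages use such titles.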

