getting an error on line 95 #1

alias-noa · 2020-12-31T03:19:31Z

Traceback (most recent call last):
File "C:/Users/Noa/PycharmProjects/sec_scraper_master/scraper.py", line 95, in
s = s + "/" + newLink
NameError: name 'newLink' is not defined

All I did was try to run it on TEVA instead of AAPL

alias-noa · 2020-12-31T03:20:32Z

What is the proper way to run this over several stocks? I just changed line 44 so maybe that's why I'm getting this error.

alias-noa · 2020-12-31T03:26:29Z

Actually, how do I even run this thing? I thought I was supposed to run scraper.py... but I'm thinking that's not the correct way. There's no main.py, so how do I run it?

alias-noa · 2020-12-31T03:33:12Z

Tried running multi and got a ton of crazy errors...

hmcguinn · 2020-12-31T03:50:17Z

Hey @alias-noa! This repo hasn't exactly been in production shape :) I've just been working around the errors locally, and I don't think I've pushed those fixes. Would you be able to copy the errors you received? I'll clean up the repo and add another comment in a little bit.

Glad you found the repo useful enough to give it a shot!

hmcguinn · 2020-12-31T04:14:02Z

A little bit more detailed comment on usage:

The scraper is set up to be run as a script from the shell; the file I use to run it is /multiThreading/multi.py. multi.py reads in a list of CIKs from /multiThreading/cik.csv. If you need something to map between CIKs and tickers, you can find that here.
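To make the CIK-list step concrete, here is a minimal sketch of reading CIKs from a CSV and looping over them. It is not the code in multi.py: the `load_ciks` helper and the one-CIK-per-row layout are assumptions about cik.csv, and the two sample values are only for illustration (check them against EDGAR before relying on them).

```python
import csv
import io


def load_ciks(csv_file):
    """Return CIKs from the first column, zero-padded to 10 digits.

    Hypothetical helper: assumes one CIK per row in the first column.
    Blank rows and non-numeric cells (such as a header) are skipped.
    """
    ciks = []
    for row in csv.reader(csv_file):
        if not row:
            continue
        cell = row[0].strip()
        if cell.isdigit():
            ciks.append(cell.zfill(10))
    return ciks


# In-memory stand-in for cik.csv; the values are illustrative samples.
sample = io.StringIO("cik\n320193\n818686\n\n")
ciks = load_ciks(sample)

for cik in ciks:
    # This is where a per-company scrape would be kicked off.
    print(cik)
```

Running over several companies then comes down to adding more rows to the CSV rather than editing a hard-coded ticker in the script.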

From there, the scraper searches through the filings for a company (viewable here). As of now, it is configured to only grab Form 3s and Form 4s (Initial Statement of Beneficial Ownership of Securities and Statement of Changes in Beneficial Ownership). That code can be found on lines 84-95 of /multiThreading/getList.py.
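As a rough illustration of that form-type filter (not the actual code in getList.py), the sketch below keeps only Form 3 and Form 4 entries from a list of filings. The dict layout, the `keep_ownership_filings` name, and the placeholder accession values are all assumptions made for the example.

```python
# Form types to keep; add others here to widen the scrape.
WANTED_FORMS = {"3", "4"}


def keep_ownership_filings(filings):
    """Return only the filings whose form type is exactly '3' or '4'."""
    return [f for f in filings if f["form"].strip() in WANTED_FORMS]


# Hypothetical filing index entries with placeholder identifiers.
filings = [
    {"form": "4", "accession": "placeholder-1"},
    {"form": "10-K", "accession": "placeholder-2"},
    {"form": "3", "accession": "placeholder-3"},
    {"form": "4/A", "accession": "placeholder-4"},
]

kept = keep_ownership_filings(filings)
print([f["accession"] for f in kept])
```

Because the match is exact, amendments such as 4/A are dropped in this sketch; adding them to `WANTED_FORMS` would include them.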

The code to actually grab info from the filings in XML form is in /multiThreading/runScraper.py. Currently, it's limited in what it grabs, but it can easily be configured to grab whatever you want from the filing. The scraper stores all the filings associated with a company in a pandas DataFrame before writing it out to an Excel file.
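To show what "configured to grab whatever you want" could look like, here is a small sketch that pulls a few fields out of a filing's XML using a field-to-path mapping. It is not the code in runScraper.py: the sample XML is a trimmed, made-up document, and the element names and the `FIELDS` mapping are assumptions, so check them against a real filing.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed-down ownership document used only for illustration.
SAMPLE_XML = """
<ownershipDocument>
  <documentType>4</documentType>
  <issuer>
    <issuerTradingSymbol>EXMP</issuerTradingSymbol>
  </issuer>
  <reportingOwner>
    <reportingOwnerId>
      <rptOwnerName>Doe Jane</rptOwnerName>
    </reportingOwnerId>
  </reportingOwner>
</ownershipDocument>
"""

# Column name -> path inside the XML; extend this to grab more fields.
FIELDS = {
    "form": "documentType",
    "ticker": "issuer/issuerTradingSymbol",
    "owner": "reportingOwner/reportingOwnerId/rptOwnerName",
}


def filing_to_row(xml_text):
    """Parse one filing and return a dict with one entry per configured field.

    Fields missing from the XML come back as None.
    """
    root = ET.fromstring(xml_text.strip())
    return {name: root.findtext(path) for name, path in FIELDS.items()}


rows = [filing_to_row(SAMPLE_XML)]
print(rows)
```

A list of row dicts like this can be passed to `pandas.DataFrame(rows)` and written out with `to_excel`, assuming pandas and an Excel writer such as openpyxl are installed.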

Hope that helps to shed a little more light on what the code does! It's not exactly the most readable thing... I'll hopefully get around to cleaning it up at some point.

I also went ahead and made a couple of changes to the repo. It should work after a pull now.

Thanks for giving it a try!
