import_pfr_passing #50

Closed
bhibby opened this issue Jul 8, 2023 · 6 comments
Labels
documentation Improvements or additions to documentation

Comments

@bhibby

bhibby commented Jul 8, 2023

Hi, does this module work? I received an error: module 'nfl_data_py' has no attribute 'import_pfr_passing'.

looking for RPO info and I believe this has it?

thank you!
b

@alecglen
Collaborator

alecglen commented Jul 10, 2023

Hey @bhibby, thanks for the callout! This looks like a documentation mistake - the import_pfr_passing hook was replaced with a more general import_pfr(stat_type) a while back. Give that a try and reply back if you have any issues. I'll get the documentation updated shortly.

Here is the docstring for the updated method:

def import_pfr(s_type, years=None):
    """Import PFR advanced statistics
    
    Args:
        s_type (str): must be one of pass, rec, rush
        years (List[int]): years to return data for, optional
    Returns:
        DataFrame
    """
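
For anyone arriving from the old docs, here is a minimal sketch of calling the replacement. `load_pfr` and its `importer` parameter are hypothetical names used only for illustration (they are not part of nfl_data_py); the allowed `s_type` values come from the docstring above, and the sketch assumes `import_pfr` takes `s_type` and `years` positionally, as that signature shows.

```python
from typing import Callable, List, Optional

# Allowed values, taken from the import_pfr docstring.
VALID_S_TYPES = ("pass", "rec", "rush")


def load_pfr(s_type: str, years: Optional[List[int]] = None,
             importer: Optional[Callable] = None):
    """Hypothetical wrapper: validate s_type, then delegate to import_pfr."""
    if s_type not in VALID_S_TYPES:
        raise ValueError(f"s_type must be one of {VALID_S_TYPES}, got {s_type!r}")
    if importer is None:
        # Imported lazily so the validation can be used without the package.
        import nfl_data_py as nfl
        importer = nfl.import_pfr
    return importer(s_type, years)
```

With this in place, `load_pfr("pass")` would take the place of a call to the removed `import_pfr_passing` hook, and a typo such as `"passing"` fails early with a readable message.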

alecglen added the documentation label on Jul 14, 2023
alecglen mentioned this issue on Jul 15, 2023
@bhibby
Author

bhibby commented Jul 24, 2023

thanks. I'm still not sure if RPO data is included? Doesn't show up in the column list

@bhibby
Author

bhibby commented Jul 24, 2023

I also pulled the PFR pass data for 2022, and the only game that shows up is the Super Bowl.

@alecglen
Collaborator

alecglen commented Jul 24, 2023

> thanks. I'm still not sure if RPO data is included? Doesn't show up in the column list

I think you're not seeing RPO data because it is only present at the seasonal level on PFR, not at the weekly level (which is what this method returns when you specify years). If you just call nfl.import_pfr("pass") without a years argument, you'll see the RPO columns returned.

I can definitely see how that is confusing if you're not actively looking at the PFR pages while using the method. We will discuss breaking these into separate _weekly and _seasonal methods to help clarify that the columns will differ.
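
To make the seasonal vs weekly difference concrete, here is a rough sketch that lists the columns found only in the seasonal data. Both function names are hypothetical helpers written for this comment; the only library call used is `nfl.import_pfr`, with and without `years`, as described above.

```python
from typing import Iterable, List


def seasonal_only_columns(seasonal_cols: Iterable[str],
                          weekly_cols: Iterable[str]) -> List[str]:
    """Return columns present in the seasonal data but missing from the weekly data."""
    weekly = set(weekly_cols)
    return [col for col in seasonal_cols if col not in weekly]


def compare_pfr_pass_columns(year: int) -> List[str]:
    """Fetch both views and list the seasonal-only columns (needs network access)."""
    import nfl_data_py as nfl  # imported lazily; only needed for the live comparison

    seasonal = nfl.import_pfr("pass")        # no years -> seasonal data
    weekly = nfl.import_pfr("pass", [year])  # years given -> weekly data
    return seasonal_only_columns(seasonal.columns, weekly.columns)
```

Calling `compare_pfr_pass_columns(2022)` should surface the RPO columns among the seasonal-only ones, assuming the two calls behave as described in this thread.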

@alecglen
Collaborator

> I also pulled the PFR pass data for 2022, and the only game that shows up is the Super Bowl.

This one is a known issue in our data source: nflverse/nflverse-pfr#30. Hopefully that will get fixed soon.

In the meantime, something like this can get the missing data per player. Just please make sure to respect PFR's server.

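import nfl_data_py as nfl
import pandas as pd

# stat_type options: "passing", "rushing_and_receiving", "defense"
def scrape_pfr_advanced_2022(name, stat_type):
    # Look up the player's PFR ID in the seasonal passing data
    pfr_id = nfl.import_pfr("pass").loc[lambda x: x.player == name, "pfr_id"].iloc[0]
    # The player URL is keyed by the first character of the ID, then the full ID
    url = f"https://www.pro-football-reference.com/players/{pfr_id[0]}/{pfr_id}/gamelog/2022/advanced/"
    table = pd.read_html(url, attrs={"id": f"advanced_{stat_type}"}, header=1)[0]
    # Drop the final row and give the rank and home/away columns clearer names
    return table.iloc[:-1].rename(columns={"Rk": "Week", "Unnamed: 6": "At"})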
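
On respecting PFR's server: if you need more than one player, a small throttling wrapper along these lines may help. `scrape_politely` is a hypothetical helper, and the 5-second default delay is an arbitrary placeholder, not a documented PFR limit; check their current usage policy before choosing a value.

```python
import time
from typing import Callable, Dict, Iterable


def scrape_politely(names: Iterable[str], scrape_one: Callable[[str], object],
                    delay_seconds: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, object]:
    """Call scrape_one for each name, pausing between consecutive requests."""
    results = {}
    for i, name in enumerate(names):
        if i > 0:
            sleep(delay_seconds)  # wait before every request after the first
        results[name] = scrape_one(name)
    return results
```

For example, `scrape_politely(names, lambda n: scrape_pfr_advanced_2022(n, "passing"))` would fetch each player in `names` one at a time with a pause in between.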

@alecglen
Collaborator

Original documentation confusion resolved via #51.

Missing 2022 data fixed with the resolution of nflverse/nflverse-pfr#30.

Confusion around seasonal vs weekly PFR data to be addressed via #53.

2 participants