Proposal event missing for some dossiers #49

Open
ghxm opened this issue May 22, 2023 · 9 comments

Comments

@ghxm

ghxm commented May 22, 2023

The 'Legislative proposal published' event seems to be missing from the dumps for some dossiers.

I came across two cases so far:

https://oeil.secure.europarl.europa.eu/oeil/popups/ficheprocedure.do?reference=2014/0285(COD)&l=en
https://oeil.secure.europarl.europa.eu/oeil//popups/ficheprocedure.do?reference=2003/0262(COD)&l=en

When I scrape these pages manually using scrapers.dossier.scrape(), all events seem to be included in the resulting dict, but not in the ep_dossiers.json data dump (2022-05-22).
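For anyone who wants to check a dump for the same symptom, here is a minimal sketch. It assumes the decompressed dump parses as a single JSON array of dossier objects with `procedure.reference` and `events[].type` fields, as in the record pasted further down in this thread. The function names and the `(COD)` filter are my own, not part of the scraper.

```python
import json

TARGET = "Legislative proposal published"


def load_dump(path):
    """Load a decompressed dump; assumes the file is one JSON array of dossiers."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def dossiers_missing_event(dossiers, event_type=TARGET, suffix="(COD)"):
    """Return references ending in `suffix` whose events lack `event_type`."""
    missing = []
    for dossier in dossiers:
        ref = dossier.get("procedure", {}).get("reference", "")
        if not ref.endswith(suffix):
            continue
        # Collect the event types actually present in this record.
        types = {event.get("type") for event in dossier.get("events", [])}
        if event_type not in types:
            missing.append(ref)
    return missing
```

Calling `dossiers_missing_event(load_dump("ep_dossiers.json"))` would then list the affected COD references, provided the file layout matches the assumption above.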

@stef
Member

stef commented May 23, 2023

interesting find! will investigate the reason and hopefully fix this asap!

@stef
Member

stef commented May 23, 2023

hah, yeah, those are old dossiers, probably scraped with an old version of the scraper that did not capture this data, or the data was not available at the time of scraping. normally we only scrape new docs, and only on special occasions do we scrape everything. to fix this i need to run a full re-scrape. if this is not urgent, ping me during the summer break of the EP and i'll trigger a full re-scrape of everything.

@ghxm
Author

ghxm commented May 24, 2023

That's good to know -- I will get back to you in the summer break!

Re. scraping new cases only, I assume this includes pending dossiers in addition to new dossiers?

@stef
Member

stef commented May 24, 2023

> Re. scraping new cases only, I assume this includes pending dossiers in addition to new dossiers?

only new and unclosed dossiers, yes.

@ghxm
Author

ghxm commented Aug 1, 2023

@stef Would now be a good time to re-scrape? :)

@stef
Member

stef commented Aug 4, 2023

done. if you confirm, please close this issue.

@ghxm
Author

ghxm commented Aug 7, 2023

Thank you!

I did a quick scan through the 2023-08-07 ep_dossiers.json dump for 2014/0285(COD) and could not find the "Legislative proposal published" event:

    {
        "meta":
        {
            "source": "https://oeil.secure.europarl.europa.eu/oeil//popups/ficheprocedure.do?reference=2016/2780(RSP)&l=en",
            "updated": "2021-12-18T16:17:41"
        },
        "procedure":
        {
            "reference": "2016/2780(RSP)",
            "title": "Multiannual plan for the stocks of cod, herring and sprat in the Baltic Sea and the fisheries exploiting those stocks",
            "subject":
            {
                "3.15.01": "Fish stocks, conservation of fishery resources",
                "3.15.04": "Management of fisheries, fisheries, fishing grounds"
            },
            "geographical_area":
            [
                "Baltic Sea area"
            ],
            "type": "RSP - Resolutions on topical subjects",
            "subtype": "Resolution on statement",
            "Other legal basis": "Rules of Procedure EP 132-p2",
            "stage_reached": "Procedure completed"
        },
        "commission":
        [
            {
                "body": "EC",
                "dg": "Maritime Affairs and Fisheries",
                "commissioner": "VELLA Karmenu"
            }
        ],
        "events":
        [
            {
                "date": "2016-06-22T00:00:00",
                "type": "Debate in Parliament",
                "body": "EP",
                "docs":
                [
                    {
                        "url": "https://www.europarl.europa.eu/doceo/document/CRE-8-2016-06-22-TOC_EN.html",
                        "title": "Debate in Parliament"
                    }
                ]
            },
            {
                "date": "2016-06-22T00:00:00",
                "type": "End of procedure in Parliament",
                "body": "EP"
            }
        ],
        "changes":
        {
            "2016-06-10T02:14:08":
            [
                {
                    "data":
                    [
                        {
                            "date": "2016-06-22T00:00:00",
                            "body": "EP",
                            "type": "Indicative plenary sitting date, 1st reading/single reading"
                        }
                    ],
                    "type": "added",
                    "path":
                    [
                        "activities"
                    ]
                },
                {
                    "data":
                    [],
                    "type": "added",
                    "path":
                    [
                        "other"
                    ]
                },
                {
                    "data":
                    [],
                    "type": "added",
                    "path":
                    [
                        "committees"
                    ]
                },
                {
                    "data":
                    {},
                    "type": "added",
                    "path":
                    [
                        "links"
                    ]
                },
                {
                    "data":
                    {
                        "reference": "2016/2780(RSP)",
                        "title": "Multiannual plan for the stocks of cod, herring and sprat in the Baltic Sea and the fisheries exploiting those stocks",
                        "geographical_area":
                        [
                            "Baltic Sea area"
                        ],
                        "stage_reached": "Awaiting Parliament 1st reading / single reading / budget 1st stage",
                        "subtype": "Resolution on statements",
                        "Modified legal basis": "Rules of Procedure of the European Parliament EP 123-p2",
                        "type": "RSP - Resolutions on topical subjects",
                        "subject":
                        [
                            "3.15.01 Fish stocks, conservation of fishery resources",
                            "3.15.04 Management of fisheries, fisheries, fishing grounds"
                        ]
                    },
                    "type": "added",
                    "path":
                    [
                        "procedure"
                    ]
                }
            ],
            "2016-06-14T01:24:31":
            [
                {
                    "data":
                    {
                        "body": "EC",
                        "dg":
                        {
                            "url": "http://ec.europa.eu/dgs/maritimeaffairs_fisheries/",
                            "title": "Maritime Affairs and Fisheries"
                        },
                        "commissioner": "VELLA Karmenu"
                    },
                    "type": "added",
                    "path":
                    [
                        "other",
                        0
                    ]
                }
            ],
            "2016-06-24T01:13:38":
            [
                {
                    "type": "changed",
                    "data":
                    [
                        "Debate in plenary scheduled",
                        "Debate in Parliament"
                    ],
                    "path":
                    [
                        "activities",
                        0,
                        "type"
                    ]
                },
                {
                    "type": "changed",
                    "data":
                    [
                        "Awaiting Parliament 1st reading / single reading / budget 1st stage",
                        "Procedure completed"
                    ],
                    "path":
                    [
                        "procedure",
                        "stage_reached"
                    ]
                }
            ],
            "2016-06-11T01:28:55":
            [
                {
                    "type": "changed",
                    "data":
                    [
                        "Indicative plenary sitting date, 1st reading/single reading",
                        "Debate in plenary scheduled"
                    ],
                    "path":
                    [
                        "activities",
                        0,
                        "type"
                    ]
                }
            ],
            "2019-07-06T10:11:14":
            [
                {
                    "type": "changed",
                    "path":
                    [
                        "procedure",
                        "subject"
                    ],
                    "data":
                    [
                        [
                            "3.15.01 Fish stocks, conservation of fishery resources",
                            "3.15.04 Management of fisheries, fisheries, fishing grounds"
                        ],
                        {
                            "3.15.01": "Fish stocks, conservation of fishery resources",
                            "3.15.04": "Management of fisheries, fisheries, fishing grounds"
                        }
                    ]
                },
                {
                    "type": "changed",
                    "path":
                    [
                        "procedure",
                        "Modified legal basis"
                    ],
                    "data":
                    [
                        "Rules of Procedure of the European Parliament EP 123-p2",
                        "Rules of Procedure EP 123-p2"
                    ]
                },
                {
                    "type": "changed",
                    "path":
                    [
                        "procedure",
                        "subtype"
                    ],
                    "data":
                    [
                        "Resolution on statements",
                        "Resolution on statement"
                    ]
                },
                {
                    "type": "deleted",
                    "path":
                    [
                        "other"
                    ],
                    "data":
                    [
                        {
                            "body": "EC",
                            "dg":
                            {
                                "url": "http://ec.europa.eu/dgs/maritimeaffairs_fisheries/",
                                "title": "Maritime Affairs and Fisheries"
                            },
                            "commissioner": "VELLA Karmenu"
                        }
                    ]
                },
                {
                    "type": "added",
                    "path":
                    [
                        "events"
                    ],
                    "data":
                    [
                        {
                            "date": "2016-06-22T00:00:00",
                            "type": "Debate in Parliament",
                            "body": "EP",
                            "docs":
                            [
                                {
                                    "url": "http://www.europarl.europa.eu/sides/getDoc.do?secondRef=TOC&language=EN&reference=20160622&type=CRE",
                                    "title": "Debate in Parliament"
                                }
                            ]
                        },
                        {
                            "date": "2016-06-22T00:00:00",
                            "type": "End of procedure in Parliament",
                            "body": "EP"
                        }
                    ]
                },
                {
                    "type": "deleted",
                    "path":
                    [
                        "links"
                    ],
                    "data":
                    {}
                },
                {
                    "type": "added",
                    "path":
                    [
                        "commission"
                    ],
                    "data":
                    [
                        {
                            "body": "EC",
                            "dg": "Maritime Affairs and Fisheries",
                            "commissioner": "VELLA Karmenu"
                        }
                    ]
                },
                {
                    "type": "deleted",
                    "path":
                    [
                        "activities"
                    ],
                    "data":
                    [
                        {
                            "date": "2016-06-22T00:00:00",
                            "body": "EP",
                            "type": "Debate in Parliament"
                        }
                    ]
                },
                {
                    "type": "deleted",
                    "path":
                    [
                        "committees"
                    ],
                    "data":
                    []
                }
            ],
            "2019-07-12T19:52:48":
            [
                {
                    "type": "changed",
                    "path":
                    [
                        "procedure",
                        "Modified legal basis"
                    ],
                    "data":
                    [
                        "Rules of Procedure EP 123-p2",
                        "Rules of Procedure EP 132-p2"
                    ]
                }
            ],
            "2021-12-18T16:17:41":
            [
                {
                    "type": "changed",
                    "path":
                    [
                        "events",
                        0,
                        "docs",
                        0,
                        "url"
                    ],
                    "data":
                    [
                        "http://www.europarl.europa.eu/sides/getDoc.do?secondRef=TOC&language=EN&reference=20160622&type=CRE",
                        "https://www.europarl.europa.eu/doceo/document/CRE-8-2016-06-22-TOC_EN.html"
                    ]
                },
                {
                    "type": "deleted",
                    "path":
                    [
                        "procedure",
                        "Modified legal basis"
                    ],
                    "data": "Rules of Procedure EP 132-p2"
                },
                {
                    "type": "added",
                    "path":
                    [
                        "procedure",
                        "Other legal basis"
                    ],
                    "data": "Rules of Procedure EP 132-p2"
                }
            ]
        }
    }

@ghxm
Author

ghxm commented Jan 9, 2024

The problem seems to persist in recent dumps (with increasing tendency). I've looked into 2014/0285 as an example, and it seems the events get changed (added / deleted) a couple of times throughout the collection history and are finally deleted on 2021-12-18T16:26:21. Am I correct to assume that the dumps represent the latest state of the respective item, i.e. with all changes applied?
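A rough sketch of how such a history can be pulled out of a record, assuming the `changes` layout seen in the record pasted above: a dict keyed by ISO timestamp, each value a list of entries with `type`, `path` and `data`. The keys in that paste are not in chronological order, so the sketch sorts them first. The function name is hypothetical.

```python
from datetime import datetime


def history_for_path(dossier, top_level_key="events"):
    """List (timestamp, change type, path) for changes under `top_level_key`, oldest first."""
    changes = dossier.get("changes", {})
    history = []
    # Sort by parsed timestamp because the dict keys may be out of order.
    for stamp in sorted(changes, key=datetime.fromisoformat):
        for change in changes[stamp]:
            path = change.get("path", [])
            if path and path[0] == top_level_key:
                history.append((stamp, change.get("type"), path))
    return history
```

If the last entry returned for `"events"` is a `deleted` change on the whole list, that would be consistent with the events being absent from the current state of the record.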

@stef
Member

stef commented Jan 9, 2024

if it is added/deleted repeatedly, then that means this is a problem with the webserver(s) of the european parliament, possibly some load-balancing synchronization problem. we also have the problem that some server is misconfigured and returns french-language output instead of english-language output. if we can find out how to reproduce this, and can indeed point at the european parliament, we could raise the issue with them...
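one possible way to try reproducing it: request the same dossier page several times and count how often the event label shows up in the response. this is only a sketch. it assumes the label appears verbatim in the returned html, and `probe` and `fetch_text` are made-up helper names, not part of the scraper. an inconsistent tally would hint at differing backends but would not prove the cause.

```python
import time
from collections import Counter
from urllib.request import urlopen


def fetch_text(url, timeout=30):
    """Fetch a page and return its body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def probe(url, marker, attempts=10, delay=1.0, fetch=fetch_text):
    """Request `url` repeatedly and tally whether `marker` is in each response."""
    tally = Counter()
    for _ in range(attempts):
        try:
            body = fetch(url)
        except OSError:  # urllib's URLError and HTTPError derive from OSError
            tally["error"] += 1
        else:
            tally["present" if marker in body else "absent"] += 1
        time.sleep(delay)  # be polite to the server between requests
    return tally
```

running `probe(page_url, "Legislative proposal published")` against one of the dossier urls from the first comment and getting both `present` and `absent` counts would be something concrete to report.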
