Allegheny, PA absentee votes, second digit #27

Open
ghost opened this issue Nov 9, 2020 · 24 comments

Comments

@ghost

ghost commented Nov 9, 2020

I made (and corrected) a quick analysis of second digits for absentee votes only in Allegheny, PA.

Allegheny absentee votes - second digit

@ghost
Author

ghost commented Nov 9, 2020

Another quick diagram from Allegheny, PA, this time it's % of total for Biden/Harris, all votes (not only absentee votes).

image

@ghost
Author

ghost commented Nov 9, 2020

Another diagram showing the same information

image

Although this is not Benford's Law related, it would be interesting to see if the same pattern holds for similar counties.

@charlesmartin14

charlesmartin14 commented Nov 9, 2020

@testes-t Can you add a red line showing the expected second-digit distribution to the bar plots above?

This may help (if you are using Python): https://github.com/milcent/benford_py/blob/master/Demo.ipynb
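
For anyone who wants the reference line without pulling in a package, here is a minimal sketch of the expected second-digit frequencies under Benford's Law. The function name `benford_second_digit_probs` is made up for this example and is not part of benford_py. The expected probability of second digit d2 is the sum of log10(1 + 1/(10*d1 + d2)) over the first digits d1 = 1..9.

```python
import math


def benford_second_digit_probs():
    """Expected probability of each second digit 0-9 under Benford's Law."""
    probs = []
    for d2 in range(10):
        # Sum over every possible first digit d1 = 1..9.
        p = sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))
        probs.append(p)
    return probs


expected = benford_second_digit_probs()
for digit, p in enumerate(expected):
    print(f"{digit}: {p:.4f}")
```

The values fall gently from about 12.0% for digit 0 to about 8.5% for digit 9. That is much flatter than the first-digit curve, so modest deviations in a bar plot are hard to judge by eye.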

@charlesmartin14

charlesmartin14 commented Nov 9, 2020

This is what I found for the second-digit data for Allegheny (absentee), using the package above:

Screen Shot 2020-11-08 at 9 21 56 PM

Screen Shot 2020-11-08 at 9 18 55 PM

Biden
Screen Shot 2020-11-08 at 9 15 56 PM

Trump (the 0 data point looks off?)
Screen Shot 2020-11-08 at 9 17 30 PM

Other than the Trump 0 data being weird (maybe this package requires a special format for the data, or has a bug?), the data looks reasonable.

@ghost
Author

ghost commented Nov 9, 2020

I also had too many 0 digits at first, and was wondering what on Earth was going on. It turned out to be a result of not having excluded Trump's many one-digit results, so you need to fix that in the Trump diagram above.

(Edit:) Biden also had a few one-digit results, so his number of zeros will likely be a little lower than in your diagram once you fix it.
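
A minimal sketch of that fix, assuming the vote counts are plain integers: counts below 10 have no second digit, so they are skipped rather than tallied as 0. The helper names and the sample numbers are made up for illustration, and I have not checked how benford_py itself treats one-digit inputs.

```python
from collections import Counter


def second_digit(count):
    """Return the second digit of a vote count, or None if it has only one digit."""
    if count < 10:
        return None
    return int(str(count)[1])


def second_digit_counts(vote_counts):
    """Tally second digits 0-9, ignoring counts that have no second digit."""
    digits = (second_digit(c) for c in vote_counts)
    tally = Counter(d for d in digits if d is not None)
    return [tally.get(d, 0) for d in range(10)]


# Made-up precinct counts, for illustration only.
sample = [3, 7, 10, 25, 104, 9, 250, 31, 0, 118]
print(second_digit_counts(sample))  # prints [2, 2, 0, 0, 0, 2, 0, 0, 0, 0]
```

The four one-digit entries (3, 7, 9 and 0) contribute nothing to the tally, so they cannot inflate the 0 bar.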

@ghost
Author

ghost commented Nov 9, 2020

Chicago, total (not only absentee):
image

@ghost
Author

ghost commented Nov 9, 2020

Fulton (Atlanta) - total, not only absentee:

image

There seem to be more people in each precinct in Fulton.

@ghost
Author

ghost commented Nov 9, 2020

Comment: Allegheny (Pittsburgh), Fulton (Atlanta) and Chicago have a "tail" starting at about 90-95%. I am not saying that the tail shouldn't be there, but it's a little strange.

@charlesmartin14

@testes-t I'm having some trouble interpreting the plots. What is the y-axis? Can you walk us through it? Thanks.

@ghost
Author

ghost commented Nov 9, 2020

@charlesmartin14

For a smooth normal distribution, the curve would have a roughly sigmoid shape. A flat stretch of the curve means low frequency; a steep stretch means high frequency. The y-axis shows the cumulative number of wards with a Biden vote share less than the percentage shown on the x-axis.

So for Fulton, an 80% Biden share is less common than a 95% Biden share. It's clearly not a normal distribution. Hypothetically, that could be due to wards being either rural or urban, and only rarely something in between, so that the diagram ends up looking like two different normal distributions, with expected values of 50% and 95% respectively, added together. I have not seen any literature suggesting that this pattern is indicative of fraud, but it's interesting nonetheless.
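
To make the two-distribution idea concrete, here is a small simulation sketch. Everything in it is hypothetical: the group sizes, the means of 50% and 95%, the spreads, and the helper names are assumptions chosen to mirror the description above, not values fitted to Fulton data.

```python
import random


def clip01(x):
    """Keep a simulated share inside the valid range 0..1."""
    return min(1.0, max(0.0, x))


def cumulative_curve(shares):
    """Sort the shares; the k-th point says k precincts have a share <= that value."""
    return [(share, k) for k, share in enumerate(sorted(shares), start=1)]


def count_between(shares, lo, hi):
    """Number of precincts with lo <= share <= hi."""
    return sum(1 for s in shares if lo <= s <= hi)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical two-group county: 500 precincts around 50%, 500 around 95%.
mixture = [clip01(rng.gauss(0.50, 0.08)) for _ in range(500)]
mixture += [clip01(rng.gauss(0.95, 0.02)) for _ in range(500)]

curve = cumulative_curve(mixture)
near_80 = count_between(mixture, 0.75, 0.85)
near_95 = count_between(mixture, 0.90, 1.00)
print(f"precincts at 75-85%: {near_80}, at 90-100%: {near_95}")
```

Plotting `curve` gives the same kind of chart as above: a steep rise near 50%, a nearly flat stretch around 80%, then a second steep rise near 95%. So a mixture of two ordinary groups is enough to produce the non-sigmoid shape.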

@ghost
Author

ghost commented Nov 9, 2020

For reference, here is the same diagram for Minnesota. It looks more like a sigmoid:

image

Above I seem to have explained the y-axis in a somewhat convoluted manner. The blue lines are the Biden % bars for each precinct (this time an ad-hoc precinct number is shown as text along the y-axis).

@ghost
Author

ghost commented Nov 9, 2020

Same plot for precincts in Hennepin County (i.e. Minneapolis):
image

We see the same non-sigmoid shape as in the other cities.

@ghost
Author

ghost commented Nov 9, 2020

I thought it would be interesting to check a city which is not in a battleground state. The pattern is not seen in Orleans Parish (part of New Orleans). But here, note that early votes are not included due to lack of data:

image

Note that I could be making errors all along; none of my charts have been "verified", so to speak.

@ghost
Author

ghost commented Nov 9, 2020

Election day votes and absentee votes for Allegheny, when seen in isolation, both seem to follow more or less a sigmoid pattern:

image

image

@ghost
Author

ghost commented Nov 9, 2020

Note that my Allegheny charts are based on the file in this project; I did not collect the data myself. I assume that the data is complete.

@CoolOppo

CoolOppo commented Nov 9, 2020

> Another quick diagram from Allegheny, PA, this time it's % of total for Biden/Harris, all votes (not only absentee votes).

I am not understanding this chart at all. What is the Y-axis? And why are things binned the way they are on the x-axis? I am very confused

@ghost
Author

ghost commented Nov 9, 2020

> I am not understanding this chart at all. What is the Y-axis? And why are things binned the way they are on the x-axis? I am very confused

It's a large collection of horizontal blue bars that represent Biden's vote share in each precinct. So the y-axis just counts up from precinct 1 to N as sorted by vote share.

@CoolOppo

CoolOppo commented Nov 9, 2020

I see. So if I'm getting this correct, ~225 precincts had a vote share of 48%-53% for Biden in Allegheny?

@ghost
Author

ghost commented Nov 9, 2020

> I see. So if I'm getting this correct, ~225 precincts had a vote share of 48%-53% for Biden in Allegheny?

Edit: I now see that you were talking about the histogram in the second comment from the top. Yes, you got it right.
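
For reference, this is roughly how a histogram bar like that is produced: each precinct's share is dropped into a fixed-width bin, and the bar height is the number of precincts in the bin. The sketch below is illustrative only; the bin edges and the shares are made up, and I am assuming half-open bins, which may not match the charting tool used above.

```python
from bisect import bisect_right


def bin_counts(values, edges):
    """Count values in half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):  # values outside the edges are ignored
            counts[i] += 1
    return counts


# Made-up Biden shares (in %) and hypothetical 5-point bins.
edges = [43, 48, 53, 58]
shares = [47.9, 48.0, 50.2, 52.99, 53.0, 57.0, 71.5]
print(bin_counts(shares, edges))  # prints [1, 3, 2]
```

Here the 48%-53% bar would have height 3. The 71.5% precinct lies outside the listed edges and is not counted.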

@ghost ghost mentioned this issue Nov 9, 2020
@ghost
Author

ghost commented Nov 9, 2020

So from the above, I hypothesised that absentee votes in Fulton, GA could show some kind of interesting pattern, so I created the following chart:
image

It's smooth; there doesn't seem to be anything suspicious here. I have spent quite a few hours investigating this now and didn't really find any smoking gun anywhere, so I'll end my investigation here.

@iraykhel

iraykhel commented Nov 9, 2020

Thank you for the detailed analysis. Do you think you could create a chart like the "Biden/Harris share of votes" one in the second comment, but for the 2016 election, so we can see how it compares? The non-decreasing tail from 70% to 97% seems iffy; I was wondering whether it was also present in the 2016 election.

@ghost
Author

ghost commented Nov 10, 2020

So, there are rumours that 130,000 invalid votes have been cast in Fulton County (Atlanta). It could be fake news; I had never seen this news website before, so it's hard to tell: https://rfangle.com/election/breaking-132000-ballots-in-georgia-likely-ineligible/

The interesting thing is that by far the strangest chart of all I have uploaded is the one I cite below, from precisely Fulton. The chart should normally take on a sigmoid shape (like Minnesota above), but it simply doesn't. So it's a funny coincidence. If feasible, you could try to analyse the deviation from normal/chi-squared/Poisson/whatever further.

Fulton (Atlanta) - total, not only absentee:

image
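
One possible way to put a number on the deviation from normal is the largest gap between the empirical cumulative curve and the cumulative curve of a normal distribution fitted to the same data (a Kolmogorov-Smirnov style distance). The sketch below runs on simulated shares, not on Fulton data; the function name and all parameters are assumptions for illustration.

```python
import random
from statistics import NormalDist


def ks_distance_from_normal(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the same data."""
    fitted = NormalDist.from_samples(values)
    ordered = sorted(values)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered, start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides.
        gap = max(gap, abs(i / n - f), abs((i - 1) / n - f))
    return gap


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical shares: one homogeneous group vs. a mix of two groups.
one_group = [rng.gauss(0.60, 0.10) for _ in range(1000)]
two_groups = [rng.gauss(0.50, 0.08) for _ in range(500)]
two_groups += [rng.gauss(0.95, 0.02) for _ in range(500)]

d_one = ks_distance_from_normal(one_group)
d_two = ks_distance_from_normal(two_groups)
print(f"one group: {d_one:.3f}, two groups: {d_two:.3f}")
```

This is a distance, not a p-value. Because the normal parameters are estimated from the same sample, the standard Kolmogorov-Smirnov tables do not apply directly. A large distance also says only that the shares are not normally distributed; it does not say why.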

@justin-winter

One thing you might check on Fulton is that it is a really weird shape (formed from 2 other bankrupt counties in the 1930s), so it really is 3+ distinct areas with very uneven economics. It has the richest homes and best schools in mid to north Fulton and some of the poorest neighborhoods and worst schools in mid to south Fulton. So in terms of the chart, that might actually be a bimodal or trimodal distribution.

> So, there are rumours that 130,000 invalid votes have been cast in Fulton County (Atlanta). Could be fake news, I never saw this news website before, so hard to tell: https://rfangle.com/election/breaking-132000-ballots-in-georgia-likely-ineligible/
>
> The interesting thing is that by far the most strange chart of all I have uploaded is the one I cite below, from precisely Fulton. The chart should normally take on a sigmoid shape (like Minnesota above), but simply doesn't. So it's a funny coincidence. If feasible, you could try to analyse the deviation from normal/chi squared/poisson/whatever further.
>
> Fulton (Atlanta) - total, not only absentee:
> image

@ghost
Author

ghost commented Nov 10, 2020

It's not anywhere close to Gaussian. Why are there so few precincts at the median, around 80%? And why do we see this pattern in Fulton, but not in Orleans?
