This workbook contains the code for the paper "Diurnal Patterns in the Spread of COVID-19 Misinformation on Twitter within Italy." A preprint is available at http://arxiv.org/abs/2307.11575.
Access to the data needed to run these analyses is restricted by the Twitter ToC. Requests to reproduce this work should be addressed to the second author, Riccardo Gallotti.
The file 'code/Italy_clustering.ipynb' gives details and background on the procedure used to separate users into clusters based on their typical activity patterns. It also details the identification and removal of users exhibiting suspicious, bot-like activity from subsequent analysis. The clustering procedure is repeated for verified and unverified users, and identifies significant differences in activity patterns between the two groups.
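As a rough illustration of what clustering users by typical activity pattern can look like, the sketch below turns each user's posting hours into a 24-bin profile and groups the profiles with a plain k-means. This is not the procedure implemented in the notebook: the choice of k-means, the function names (`hourly_profile`, `kmeans`), and the toy users are all assumptions made for the example.

```python
from collections import Counter


def hourly_profile(post_hours):
    """Turn a user's posting hours (0-23) into a 24-bin share-of-activity vector.

    Assumes the user has at least one post.
    """
    counts = Counter(post_hours)
    total = len(post_hours)
    return [counts.get(h, 0) / total for h in range(24)]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, n_iter=20):
    """Plain k-means; the first k profiles seed the centroids (deterministic)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assign every profile to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in profiles
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: posting hours for four made-up users.
users = {
    "morning_a": [7, 8, 8, 9, 10],
    "evening_a": [20, 21, 22, 22, 23],
    "morning_b": [6, 7, 8, 9, 9],
    "evening_b": [19, 21, 21, 22, 23],
}
profiles = [hourly_profile(hours) for hours in users.values()]
labels, centroids = kmeans(profiles, k=2)
```

In this toy setup the two morning users end up in one cluster and the two evening users in the other. Real data would need a principled choice of the number of clusters and of the initial centroids.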
The file 'code/Italy_fourier.ipynb' smooths the curves of average daily activity and of the ratio of potentially machinated content per cluster extracted in 'code/Italy_clustering.ipynb'. Finally, the file 'code/Italy_analysis.ipynb' contains the statistical tests and plots used in the paper that are not covered elsewhere.
The files 'code/Italy_fourier_unverified.ipynb' and 'code/Italy_analysis.ipynb' repeat the same analyses when considering only unverified clusters. Similarly, the files 'code/Germany_clustering.ipynb', 'code/Germany_fourier.ipynb', and 'Germany_analysis.ipynb' repeat the analysis for data from Germany.
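The smoothing step performed in the Fourier notebooks can be pictured with a minimal sketch: a 24-hour curve is decomposed into its discrete Fourier components, and only the mean plus the lowest few harmonics are kept. The function name `fourier_smooth`, the parameter `n_harmonics`, and the synthetic curve are assumptions for illustration, and the number of frequencies retained in the notebooks may differ.

```python
import cmath
import math


def fourier_smooth(values, n_harmonics=2):
    """Smooth a periodic curve by keeping its mean and lowest harmonics.

    Requires n_harmonics < len(values) / 2.
    """
    n = len(values)
    # Discrete Fourier coefficients for frequencies 0..n_harmonics.
    coeffs = [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values)) / n
        for k in range(n_harmonics + 1)
    ]
    smoothed = []
    for t in range(n):
        total = coeffs[0].real  # frequency 0 is the mean of the curve
        for k in range(1, n_harmonics + 1):
            # For real input, harmonic k and its mirror n-k combine into
            # a single real wave: 2 * Re(c_k * exp(2*pi*i*k*t/n)).
            total += 2 * (coeffs[k] * cmath.exp(2j * math.pi * k * t / n)).real
        smoothed.append(total)
    return smoothed


# Synthetic hourly curve: a daily cycle plus a fast wiggle at harmonic 7.
raw = [
    3 + 2 * math.cos(2 * math.pi * h / 24) + 0.5 * math.cos(2 * math.pi * 7 * h / 24)
    for h in range(24)
]
smooth = fourier_smooth(raw, n_harmonics=2)
```

Because the fast wiggle sits at a harmonic above the cutoff, the smoothed curve retains only the mean and the daily cycle. Keeping more harmonics preserves finer structure, such as a second daily peak, at the cost of less smoothing.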
The figures contained in the paper are plotted in the following locations:
- Figure 2 a-b: code/Italy_fourier.ipynb: Activity: Multiple-frequency decomposition
- Figure 2 c-d: code/Italy_fourier.ipynb: Ratio of potentially machinated content: Multiple-frequency decomposition
- Figure 3: code/Italy_analysis.ipynb: Sunlight
- Figure 4: code/Italy_analysis.ipynb: Intercluster Variation: Diurnal Variation
- Supplementary Figure 1: code/Germany_fourier.ipynb: Germany VS Italy
- Supplementary Figure 2: code/Italy_fourier_unverified.ipynb: Unverified VS all
- Table 1: code/Italy_fourier.ipynb: Statistics: Distribution size of ratios of potentially machinated content across clusters
- Table 2: code/Italy_analysis.ipynb: Inter-cluster variation: Correlation of user activity with ratio of potentially machinated content
- Table 3: code/Italy_analysis.ipynb: Sunlight: Statistics
- Table 4: code/Italy_analysis.ipynb: Lockdown and harmful content
- Supplementary Table 2: code/Italy_analysis.ipynb: Corpus, general
- Supplementary Table 3: code/Italy_fourier.ipynb: Activity: Multiple-frequency decomposition
- Supplementary Table 4: code/Italy_fourier.ipynb: Bimodality
- Supplementary Table 5: code/Italy_fourier.ipynb: Activity: Multiple-frequency decomposition
- Supplementary Table 6: code/Italy_analysis.ipynb: Inter-cluster variation: Content types: Smoothed
- Supplementary Table 7: code/Italy_analysis.ipynb: Inter-cluster variation: Content types: Smoothed
- Supplementary Table 8: code/Italy_fourier.ipynb: Ratio of potentially machinated content: Multiple-frequency decomposition
- Supplementary Table 9: code/Italy_clustering.ipynb: Pseudo-Chronotypes: Cluster statistics
- Supplementary Table 10: code/Italy_clustering.ipynb: Pseudo-Chronotypes: Cluster statistics
- Supplementary Table 11: code/Italy_analysis.ipynb: All and unverified
- Supplementary Table 12: code/Germany_clustering.ipynb: Pseudo-Chronotypes: Cluster statistics
- Supplementary Table 13: code/Germany_fourier.ipynb: Activity
- Supplementary Table 14: code/Germany_fourier.ipynb: Activity
- Supplementary Table 15: code/Germany_fourier.ipynb: Ratio of potentially machinated content
- Supplementary Table 16: code/Italy_analysis.ipynb: Inter-cluster variation: Correlation of user activity with ratio of potentially machinated content