LDAVAE

LDAVAE is an automated fake news detection method that combines two types of features from two models:

1. VAE: Deep neural embedding.

Motivation: Efficiency
Provides a lower dimensional semantic representation of the news article.

2. LDA: Probabilistic topic modelling.

Motivation: Interpretability
Provides topic-based features.
Provides a lower dimensional semantic representation of the news article.
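
As a rough illustration of how the two feature types could be used together, the sketch below concatenates a VAE latent vector with LDA topic proportions into a single feature vector per article. The helper name `combine_features` and the toy values are hypothetical and not taken from this repository; plain concatenation is an assumption about how the features are merged.

```python
from typing import List, Sequence


def combine_features(vae_latent: Sequence[float],
                     lda_topics: Sequence[float]) -> List[float]:
    """Concatenate VAE latent features with LDA topic proportions.

    Hypothetical helper for illustration only.
    """
    # LDA topic proportions for one document form a probability distribution.
    if abs(sum(lda_topics) - 1.0) > 1e-6:
        raise ValueError("LDA topic proportions must sum to 1")
    return list(vae_latent) + list(lda_topics)


# Toy example: 4 latent features and 3 topics (made-up numbers).
latent = [0.12, -0.80, 0.33, 0.05]
topics = [0.7, 0.2, 0.1]
features = combine_features(latent, topics)
print(len(features))
```

With this layout the trailing entries of the vector stay readable as topic weights, which is where the interpretability motivation comes from.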

Contribution:

(1) We provide interpretability by incorporating Bayesian topic modelling and inferring topic compositions in news articles as added features for classification.

(2) Our model works in the data scarcity scenario where only textual content is available.

(3) We keep our model efficient by coupling a deep architecture (VAE) to LDA.

LDAVAE Notations:

$\mathcal{D}$: Dataset.
$\mathcal{D}_{tr}$: Training set.
$\mathcal{D}_{te}$: Test set.
$N$: Number of samples, indexed by $i$.
$V$: Vocabulary (set of words) covered by word2vec.
$n_f$: Number of latent features obtained from encoder.
$w$: word2vec dimension.
$l_i$: Length of sample $i$ (number of words).
$L = \max \{ l_i : i = 1, \dots, N \}$: Maximum sample length.
$t_i^{(j)}$: Word $j$ in sample $i$.
$\lambda_1$: Regularization parameter ($=0.05$).
$\lambda_2$: Regularization parameter ($=0.3$).
$K$: Number of topics.
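
To make the notation concrete, here is a minimal sketch that computes $l_i$ and $L$ for a toy tokenized dataset and pads every sample to length $L$, matching the `(None, L)` input shape used by the encoder. The helper `encode_and_pad`, the padding id `0`, and the vocabulary built from the samples themselves are assumptions for illustration; in LDAVAE the vocabulary $V$ comes from word2vec.

```python
from typing import Dict, List, Tuple

PAD_ID = 0  # assumption: id 0 is reserved for padding


def encode_and_pad(
    samples: List[List[str]],
) -> Tuple[List[List[int]], List[int], int, Dict[str, int]]:
    """Map tokens to integer ids and right-pad every sample to length L."""
    if not samples:
        raise ValueError("need at least one sample")
    vocab: Dict[str, int] = {}
    for tokens in samples:
        for tok in tokens:
            # Assign ids in order of first appearance, starting at 1.
            vocab.setdefault(tok, len(vocab) + 1)
    lengths = [len(tokens) for tokens in samples]  # l_i for each sample i
    max_len = max(lengths)                         # L = max_i l_i
    padded = [
        [vocab[t] for t in tokens] + [PAD_ID] * (max_len - len(tokens))
        for tokens in samples
    ]
    return padded, lengths, max_len, vocab


# Toy dataset with N = 2 samples.
samples = [
    ["vaccine", "cures", "all"],
    ["officials", "deny", "the", "new", "claim"],
]
padded, lengths, max_len, vocab = encode_and_pad(samples)
print(lengths, max_len)
print(padded)
```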

VAE:

VAE Structure:

Encoder:

| Layer | Output Shape | Param # | Other Settings |
| --- | --- | --- | --- |
| Input | [(None, $L$)] | 0 | |
| Embedding | (None, $L$, $w$) | $\lvert V \rvert \times w$ | Non-trainable (word2vec) |
| Bi. LSTM | (None, $L$, $2n_f$) | $8n_f(w+n_f+1)$ | activation='tanh' |
| Bi. LSTM | (None, $2n_f$) | $8n_f(3n_f+1)$ | activation='tanh' |
| Dense | (None, $n_f$) | $n_f(2n_f+1)$ | activation='tanh' |
| Dense ($h$) | (None, $n_f$) | $n_f(n_f+1)$ | activation='tanh' |
| Sampling | (None, $n_f$) | 0 | $= \mu_h + \epsilon \exp(\sigma^2_h/2)$ |
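
The Sampling row is a reparameterization step: noise $\epsilon$ is drawn from a standard normal and then shifted and scaled by the encoder outputs. A minimal sketch in plain Python follows. The function name `sample_latent` is hypothetical, and the argument `s` stands for the quantity the table writes as $\sigma^2_h$; in many VAE implementations this output is treated as a log-variance, so that $\exp(s/2)$ is a standard deviation, but that reading is an assumption here.

```python
import math
import random
from typing import List, Optional, Sequence


def sample_latent(mu: Sequence[float],
                  s: Sequence[float],
                  rng: Optional[random.Random] = None) -> List[float]:
    """Return mu + eps * exp(s / 2) element-wise, with eps ~ N(0, 1)."""
    if len(mu) != len(s):
        raise ValueError("mu and s must have the same length (n_f)")
    if rng is None:
        rng = random.Random()
    # One independent noise draw per latent dimension.
    return [m + rng.gauss(0.0, 1.0) * math.exp(v / 2.0)
            for m, v in zip(mu, s)]


# Toy example with n_f = 3 and a fixed seed for reproducibility.
mu_h = [0.5, -1.0, 0.0]
s_h = [-2.0, -2.0, -2.0]
z = sample_latent(mu_h, s_h, random.Random(0))
print(len(z))
```

Because the randomness enters only through $\epsilon$, the output stays differentiable with respect to the encoder outputs, which is what allows the encoder to be trained with backpropagation.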

Decoder:

| Layer | Output Shape | Param # | Other Settings |
| --- | --- | --- | --- |
| Input | [(None, $n_f$)] | 0 | |
| Dense | (None, $n_f$) | $n_f(n_f+1)$ | activation='tanh' |
| Repeat Vector | (None, $L$, $n_f$) | 0 | |
| LSTM | (None, $L$, $n_f$) | $4n_f(2n_f+1)$ | activation='tanh' |
| LSTM | (None, $L$, $n_f$) | $4n_f(2n_f+1)$ | activation='tanh' |
| Time Dist. | (None, $L$, $\lvert V \rvert$) | $\lvert V \rvert(n_f+1)$ | activation='softmax' |

Classifier:

| Layer | Output Shape | Param # | Other Settings |
| --- | --- | --- | --- |
| Input | [(None, $n_f$)] | 0 | |
| Dense | (None, $2n_f$) | $2n_f(n_f+1)$ | activation='tanh' |
| Dense | (None, $n_f$) | $n_f(2n_f+1)$ | activation='tanh' |
| Output | (None, 1) | $n_f+1$ | activation='sigmoid' |
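
The parameter counts in the three tables follow from the standard formulas for dense and LSTM layers: a dense layer with `n_in` inputs and `n_out` outputs has `n_out * (n_in + 1)` parameters, and an LSTM with `units` cells has `4 * units * (n_in + units + 1)`, doubled for a bidirectional wrapper. The sketch below evaluates these for each row. The function names and the example values (`vocab_size=10_000`, `w=300`, `n_f=32`) are hypothetical and chosen only for illustration.

```python
from typing import Dict


def dense_params(n_in: int, n_out: int) -> int:
    # Weight matrix (n_in x n_out) plus one bias per output unit.
    return n_out * (n_in + 1)


def lstm_params(n_in: int, units: int) -> int:
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * units * (n_in + units + 1)


def bilstm_params(n_in: int, units: int) -> int:
    # Forward and backward LSTMs have separate parameters.
    return 2 * lstm_params(n_in, units)


def ldavae_param_counts(vocab_size: int, w: int,
                        n_f: int) -> Dict[str, Dict[str, int]]:
    """Per-layer parameter counts following the tables above."""
    return {
        "encoder": {
            "embedding": vocab_size * w,  # non-trainable word2vec weights
            "bilstm_1": bilstm_params(w, n_f),
            "bilstm_2": bilstm_params(2 * n_f, n_f),
            "dense": dense_params(2 * n_f, n_f),
            "dense_h": dense_params(n_f, n_f),
        },
        "decoder": {
            "dense": dense_params(n_f, n_f),
            "lstm_1": lstm_params(n_f, n_f),
            "lstm_2": lstm_params(n_f, n_f),
            "time_distributed": dense_params(n_f, vocab_size),
        },
        "classifier": {
            "dense_1": dense_params(n_f, 2 * n_f),
            "dense_2": dense_params(2 * n_f, n_f),
            "output": dense_params(n_f, 1),
        },
    }


# Hypothetical sizes, for illustration only.
counts = ldavae_param_counts(vocab_size=10_000, w=300, n_f=32)
for part, layers in counts.items():
    print(part, sum(layers.values()))
```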

Marjan-Hosseini/LDAVAE
