AI Content Detector (https://aicontentdetector.streamlit.app/)

In this project, I developed an AI content detector based on two NLP concepts: perplexity and burstiness.

Perplexity: Perplexity measures how well a probability model predicts a sample, particularly in Natural Language Processing (NLP). A language model, which generates and evaluates sentences, should assign higher probabilities to well-written texts. Perplexity thus captures a model's uncertainty in predicting text. For example, given a trained language model that predicts words from a limited set, the probability of the sentence "a red fox." is calculated by multiplying the probabilities of each word conditional on its predecessors: P("a red fox.") = P("a") * P("red" | "a") * P("fox" | "a red") * P("." | "a red fox").
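As a minimal sketch of the calculation above, the snippet below multiplies per-word conditional probabilities and converts them to perplexity using the standard definition (the exponential of the average negative log-probability per token). The probability values and the function names `sentence_probability` and `perplexity` are illustrative assumptions, not taken from this project; in the detector, such probabilities would come from the language model rather than a hard-coded list.

```python
import math

# Hypothetical conditional probabilities for "a red fox."
# (illustrative values, not produced by a real model).
token_probs = [
    0.40,  # P("a")
    0.27,  # P("red" | "a")
    0.55,  # P("fox" | "a red")
    0.79,  # P("." | "a red fox")
]

def sentence_probability(probs):
    """Chain rule: multiply the conditional probability of each token."""
    return math.prod(probs)

def perplexity(probs):
    """Exponential of the average negative log-probability per token."""
    avg_neg_log_prob = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(avg_neg_log_prob)

print(f"P(sentence) = {sentence_probability(token_probs):.4f}")
print(f"perplexity  = {perplexity(token_probs):.4f}")
```

A lower perplexity means the model finds the text more predictable, which detectors of this kind commonly treat as a hint of machine-generated text.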

Burstiness: In a unigram model, the occurrences of a word are assumed to be spread evenly across events (words), so they can be represented as repeated Bernoulli trials with probability P(w). This model works for most function words, but content words are distributed differently, and a bigram model is used for them. Here I'm using the GPT-2 transformer. Users can check whether their text contains AI content and, when analyzing the text, replace the flagged content using the provided review to reduce the chances of AI content.
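The unigram assumption described above can be illustrated with a short sketch: if every token is an independent Bernoulli trial with probability P(w), the number of occurrences of a word in a document follows a binomial distribution. The function name `occurrence_probability` and the numbers used are hypothetical and chosen only for illustration; this is not the burstiness computation used in app.py.

```python
from math import comb

def occurrence_probability(p_w, n_tokens, k):
    """Probability that a word with per-token probability p_w appears
    exactly k times in n_tokens independent Bernoulli trials."""
    return comb(n_tokens, k) * (p_w ** k) * ((1 - p_w) ** (n_tokens - k))

# Hypothetical numbers: a word with P(w) = 0.01 in a 200-token document.
p_w, n_tokens = 0.01, 200
expected_count = p_w * n_tokens  # mean of the binomial distribution

for k in range(5):
    print(k, round(occurrence_probability(p_w, n_tokens, k), 4))
print("expected count:", expected_count)
```

A bursty content word tends to appear several times in a few documents and not at all in most others, so its observed counts deviate from these binomial probabilities.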

Technologies used:

Streamlit
Streamlit.io for deployment
Machine Learning
NLP (standard measures of perplexity and burstiness, and preprocessing for the model)
Data visualization
PyTorch
Matplotlib

A second model was trained on the AI_Human dataset with the following steps:

NLP techniques for data preprocessing.
A pipeline consisting of CountVectorizer, TfidfTransformer, and a MultinomialNB model. The accuracy achieved is 95%.
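A pipeline of this shape could look roughly like the sketch below. It assumes scikit-learn and uses a tiny made-up training set in place of the AI_Human dataset; the step names (`counts`, `tfidf`, `clf`), the example texts, and the default hyperparameters are assumptions for illustration, and the sketch does not reproduce the 95% accuracy reported above.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Tiny made-up training set (a stand-in for the AI_Human dataset).
texts = [
    "honestly i dunno lol the movie was kinda meh",
    "we grabbed tacos and laughed all night",
    "in conclusion it is important to note that furthermore",
    "moreover it is important to consider the following aspects",
]
labels = ["human", "human", "ai", "ai"]

pipeline = Pipeline([
    ("counts", CountVectorizer()),   # raw token counts
    ("tfidf", TfidfTransformer()),   # reweight counts by TF-IDF
    ("clf", MultinomialNB()),        # multinomial Naive Bayes classifier
])
pipeline.fit(texts, labels)

print(pipeline.predict(["it is important to note the following"]))
```

Wrapping the three steps in one Pipeline keeps vectorization and classification together, so the same transformations are applied at training and prediction time.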

1) Text pasted from a public PDF:

Result: AI content not detected in the text.

2) Text generated by ChatGPT:

Result: AI content detected in the text. The review shows that the top 10 words contain AI content; change that content to reduce the amount of AI content.

QuillBot result:

3) Download the review:

Steps:

1. Download the files from my GitHub account (https://github.com/neha13rana/AIcontentdetector) or, if Git is set up on your device, clone the repository with: git clone https://github.com/neha13rana/AIcontentdetector.git
2. Set up a virtual environment: python -m venv venv, then activate it with venv\Scripts\activate (this is the Windows activation path).
3. Install the dependencies: pip install -r requirements.txt
4. Run the app: streamlit run app.py
5. Open the local/network URL in your browser.
6. Enter your text and check whether it is free of AI content. To reduce the amount of AI content, change the words using the review provided by the site.

Resources: Kaggle

https://medium.com/nlplanet/two-minutes-nlp-perplexity-explained-with-simple-probabilities-6cdc46884584

https://nlp.fi.muni.cz/raslan/2011/paper17.pdf
