Fine tuning AraGPT2 in order to generate ads in Darija

IssamLL/moroccan-ads-generator

Moroccan Ads Generation Using NLP

Introduction

This project tackles an NLP task: generating Moroccan ads in Darija.
To achieve this, we chose to start from a pretrained model, build a dataset suited to the task, and then fine-tune the model on that dataset. The pretrained model we chose is AraGPT2, which is used for Arabic text generation.
The dataset was built using web scraping techniques.
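
Scraped text usually needs cleaning before it can be used for fine-tuning. The following is a minimal sketch of such a step, not the project's actual pipeline: the function names `clean_ad` and `build_corpus` are hypothetical, the regex tag stripping is a crude stand-in for a proper HTML parser such as BeautifulSoup, and the choice to drop tatweel and duplicates is an assumption.

```python
import re

TATWEEL = "\u0640"  # Arabic elongation character, purely decorative


def clean_ad(text):
    """Normalize one scraped ad (hypothetical helper, assumed cleaning rules)."""
    text = re.sub(r"<[^>]+>", " ", text)      # drop leftover HTML tags
    text = text.replace(TATWEEL, "")          # remove tatweel characters
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return text


def build_corpus(raw_ads):
    """Clean every ad, skipping empty results and exact duplicates."""
    seen = set()
    corpus = []
    for raw in raw_ads:
        ad = clean_ad(raw)
        if not ad or ad in seen:
            continue
        seen.add(ad)
        corpus.append(ad)
    return corpus
```

The resulting list can be written one ad per line to a text file, or loaded into a Pandas DataFrame, before tokenization.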

How to Use

Suppose you are an entrepreneur who has just come up with a brand new product and you want to advertise it in Morocco. You want to generate an ad for your product, but you don't know how to write it. That's where our model comes in handy.
Start the prompt by typing a few words about your product, and the model will try to generate a well-written script that is as relevant as possible to your target customers.

Demo

Starting prompt:

اتاي

Generated ad:

! اتااااااااي

Limitations

The main limitation we faced during the development of this project is:

  • The scarcity of Moroccan ad data, which made it hard to build a dataset large enough to train our model on.

Business Value

Once the limitation cited above is overcome, this project could produce ads closely tailored to Moroccan culture without any need to hire a copywriter, which would save a lot of time and money. Used in combination with a recommender system, it could also generate ads tailored to each customer, which would increase the chances of the customer buying the product. It could also be paired with a diffusion model to generate ads for social media or shorts, or to promote small businesses.

Installation

To install the required packages, run the following command:

pip install -r requirements.txt

Packages Used

  • Transformers
  • Python
  • BeautifulSoup
  • Selenium
  • Pandas ...

DarijaBERTGenAD Model

This repository contains the pre-trained DarijaBERTGenAD model, which is based on the BERT architecture and fine-tuned for Darija, a North African Arabic dialect.

Hugging Face Model Repository

You can access the pre-trained model and use it in your NLP projects through the Hugging Face model repository. Here is the link to the model:

Link to DarijaBERTGenAD on Hugging Face

Usage

To use this model in your code, you can leverage the Hugging Face Transformers library. Here's an example of how to load the model in Python:

from transformers import AutoModel, AutoTokenizer

# Load the tokenizer and model
model_name = "IssamL/darijabertgenad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Use the model for your NLP tasks
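
The snippet above loads the bare model, which returns hidden states rather than text. To produce an ad from a prompt, the model needs a language-modeling head. Here is a minimal sketch, assuming the checkpoint can be loaded with `AutoModelForCausalLM` (the introduction describes fine-tuning AraGPT2, but this is not confirmed for this checkpoint) and that PyTorch is installed. The helper names and the sampling values (`max_new_tokens=60`, `top_p=0.95`) are illustrative assumptions, not settings taken from the project.

```python
def load_causal_lm(model_name="IssamL/darijabertgenad"):
    """Load the tokenizer and the model with a language-modeling head.

    Assumes the checkpoint is compatible with AutoModelForCausalLM.
    """
    # Imported lazily so generate_ad below can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    return model, tokenizer


def generate_ad(prompt, model, tokenizer, max_new_tokens=60):
    """Continue `prompt` with sampled text and return the decoded string."""
    inputs = tokenizer(prompt, return_tensors="pt")  # PyTorch tensors
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # illustrative length limit
        do_sample=True,                 # sample instead of greedy decoding
        top_p=0.95,                     # illustrative nucleus sampling value
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example (downloads the checkpoint on first use):
# model, tokenizer = load_causal_lm()
# print(generate_ad("اتاي", model, tokenizer))
```

Because sampling is enabled, each call can return a different ad for the same prompt.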

Acknowledgements

The authors that have contributed to this project are:

My team and I entered this project in the hackathon organized and sponsored by 1337 School. Huge thanks to them for this opportunity.

We also thank the mentors who helped us tailor our solution and make it more feasible.

Contributors 4
