# Boost your pipeline with pipeline-turbo

pipeline-turbo is a package designed to speed up a processing pipeline by running it across multiple threads in the background. It has been used for both CPU and GPU tasks.

The only prerequisite is a function that processes a single input. You then set the number of threads according to the resources you have available.

##### Author: Deepak John Reji (www.linkedin.com/in/deepak-john-reji)

### Step-1: Install pipeline-turbo with pip

In [1]:
!pip install pipeline-turbo





### Step-2: Get your data ready

In [2]:
# let's get some data for processing
sentences = ["Nevertheless, Trump and other Republicans have tarred the protests as havens for terrorists intent on destroying property.", "Billie Eilish issues apology for mouthing an anti-Asian derogatory term in a resurfaced video.", "Christians should make clear that the perpetuation of objectionable vaccines and the lack of alternatives is a kind of coercion.", "There have been a protest by a group of people", "While emphasizing he’s not singling out either party, Cohen warned about the danger of normalizing white supremacist ideology."]

sentences1 = sentences * 100
print(f"total sentences : {len(sentences1)}")

total sentences : 500


### Step-3: Create your pipeline

In [3]:
# Create your process. This example runs a bias detection model over a few sentences.
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from transformers import pipeline
tokenizer = AutoTokenizer.from_pretrained("d4data/bias-detection-model")
model = TFAutoModelForSequenceClassification.from_pretrained("d4data/bias-detection-model")
# device=0 runs the model on the first GPU; use device=-1 to run on the CPU instead
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer, device=0)

All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at d4data/bias-detection-model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


In [4]:
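def bias_classification(text):
    """Classify a single sentence and return the raw pipeline output."""
    out = classifier(text)
    # To keep only the top prediction's fields, uncomment:
    # label_ = out[0]['label']
    # probability_ = out[0]['score']
    return out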

### Without turbo

In [5]:
import time

start = time.time()

# Without turbo: loop over all the sentences one at a time (the normal method)
out_list = []
for sent in sentences1:
    out = bias_classification(sent)
    out_list.append(out)

print(f"Time taken for execution: {(time.time() - start) / 60} minutes")
print("Extraction is completed.......!")

Time taken for execution: 0.45236920515696205 minutes
Extraction is completed.......!
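
To compare the two approaches on equal terms, it can help to time both runs with the same helper. The sketch below is one way to do that. `timed` is a hypothetical name that is not part of pipeline-turbo, and `time.perf_counter` is used here instead of `time.time` because it is intended for measuring short durations.

```python
import time


def timed(func, *args, **kwargs):
    """Hypothetical helper: call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload so the sketch runs without a model
total, seconds = timed(sum, [1, 2, 3])
print(f"result={total}, took {seconds:.6f} seconds")
```

In this notebook, the same helper could wrap both the plain loop and the `turbo_threading` call so that both timings are reported in the same unit.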


### With turbo

In [6]:
from pipeline_turbo.turbo import turbo_threading # import the turbo threading function which does the magic

In [7]:
turbo_out = turbo_threading(sentences1, bias_classification, num_threads=5)

500it [00:27, 18.42it/s]
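
The internals of `turbo_threading` are not shown in this notebook. As a rough illustration of the multi-threading idea mentioned in the introduction, the sketch below applies a single-input function to many inputs with Python's standard `ThreadPoolExecutor`. This is an assumption about the general technique, not the package's actual implementation, and `threaded_map` and `word_count` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def threaded_map(items, func, num_threads=5):
    """Hypothetical helper: apply func to every item using a pool of threads.

    Returns a dict mapping each input item to its result. Duplicate inputs
    collapse into a single key.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map yields results in input order, so zip pairs each item
        # with its own result
        results = list(pool.map(func, items))
    return dict(zip(items, results))


# Stand-in workload so the sketch runs without a model
def word_count(text):
    return len(text.split())


counts = threaded_map(["a b c", "d e"], word_count, num_threads=2)
print(counts)  # {'a b c': 3, 'd e': 2}
```

Threads in Python tend to help most when the function spends its time waiting on I/O or inside native code that releases the interpreter lock, so the gain depends on the workload and is worth measuring.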


In [8]:
turbo_out['Billie Eilish issues apology for mouthing an anti-Asian derogatory term in a resurfaced video.']

[{'label': 'Biased', 'score': 0.8979015946388245}]
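
The lookup above suggests that `turbo_out` behaves like a dictionary keyed by input sentence. Assuming that is the case, the results can be flattened into simple rows for filtering or export. The sketch below uses a small stand-in dictionary rather than the real output. `flatten_results` is a hypothetical helper, the keys are placeholders, and the second score is a made-up value for illustration.

```python
def flatten_results(results):
    """Hypothetical helper: turn {sentence: [{'label', 'score'}]} into rows."""
    rows = []
    for sentence, out in results.items():
        top = out[0]  # the first (and here only) prediction for this sentence
        rows.append(
            {"sentence": sentence, "label": top["label"], "score": top["score"]}
        )
    return rows


# Stand-in for turbo_out (ASSUMED shape: input sentence -> classifier result)
turbo_out_example = {
    "sentence A": [{"label": "Biased", "score": 0.8979015946388245}],
    "sentence B": [{"label": "Biased", "score": 0.61}],  # made-up score
}

rows = flatten_results(turbo_out_example)
confident = [row["sentence"] for row in rows if row["score"] >= 0.8]
print(confident)  # ['sentence A']
```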