# SimplePlugin development extension

This chapter covers plugin development where code to perform a function already exists but needs to be integrated into the PCP framework.

Chapter 2 built a plugin capable of scraping webpages.

This chapter assumes there is a third-party library that connects over a socket and can scrape content from its own IP address.

This is important because you may want to scrape content from different source IPs and include the source IP as a factor in a job.

Users may not wish to deploy the entire PCP stack across a broad range of clients, or may not be able to modify existing code.

In [2]:

#This Cell emulates the ramrodpcp system preparing the environment for the plugin
from os import environ
environ["PORT"] = "10000" 
environ["RETHINK_HOST"] = "10.0.0.1"
environ["STAGE"] = "PROD" 
environ['PLUGIN_NAME'] = "SamplePlugin"


In [1]:
# Note: Always get the base class from upstream.
# The copy vendored in this repo reflects the current implementation but may not stay forward compatible.
from src.controller_plugin import ControllerPlugin


The imported controller_plugin.py file is vendored from 
   https://github.com/ramrod-project/backend-interpreter/blob/master/src/controller_plugin.py  

Ongoing development should obtain a current copy from the ramrod-project/backend-interpreter repository.



## Assumptions

1. There are numerous endpoints (identified by IPv4 address) that connect back to this plugin via a raw socket.
2. Jobs may name a specific scraper to run the scrape (via the job target).
3. Communication with the scraper over the raw socket uses a proprietary format (unknown to PCP).


## Requirement changes
Change requirement [REQ-F]-2 to include the client address, as shown below.

The PCP system will return a job for that specific client only (skipping jobs for other clients).

The updated plugin may use the self.request_job_for_client(location, port) function.

**Important note**
* environ['PORT'] is stored as a string
* self.port is stored as an integer
* self.request_job_for_client(location<str>, port<str>) requires the string version of the port.
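
The string versus integer distinction in the note above can be illustrated with the standard library alone. This is a minimal sketch that does not touch the PCP base class; the variable names `port_from_env`, `port_for_bind`, and `port_for_job_lookup` are illustrative and not part of ControllerPlugin.

```python
from os import environ

# Emulate the PCP system setting the port, as in the first cell of this chapter.
environ["PORT"] = "10000"

# Environment variables are always strings.
port_from_env = environ["PORT"]

# socket.bind() needs an integer port (this mirrors self.port).
port_for_bind = int(port_from_env)

# The job lookup is described above as taking the string form of the port.
port_for_job_lookup = str(port_for_bind)

print(type(port_from_env).__name__, type(port_for_bind).__name__, port_for_job_lookup)
```

Converting explicitly at each boundary avoids passing the integer where the string form is expected.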


In [5]:
from src.controller_plugin import ControllerPlugin
import socket

#same command from Chapter 2 
cmdWget = {"CommandName" : "wget",
          "Tooltip" : "Retrieves raw webpage content",
          "Output" : True,
          "Inputs" : [{"Name" : "Target IP Address",
                     "Type" : "textbox",
                     "Tooltip" : "Modify the URL to the desired target IP address",
                     "Value" : ""}]}
FUNCTIONALITY = [cmdWget] # [REQ-C]-2.a
class SamplePlugin(ControllerPlugin): # [REQ-C]-1
    def __init__(self):
        self.name = "SamplePlugin"
        super().__init__(self.name, FUNCTIONALITY) # [REQ-C]-2.b
    def _start(self): # [REQ-C]-3
        # Chapter 3: listen for client web scrapers connecting back over a raw socket
        self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # self.port is dictated by the PCP system from user input;
        # bind to all interfaces ('' rather than 'localhost') so remote scrapers can connect
        self.s.bind(('', self.port))
        self.s.listen(1)
        while True: # [REQ-F]-1 (simple control loop)
            conn, client_addr = self.s.accept() # Chapter 3 addition (external scraper connected)
            client_ip = client_addr[0] # accept() returns (host, port); the job lookup needs the host string
            # Chapter 3 change: request a job for this specific client rather than for any client
            job = self.request_job_for_client(client_ip, str(self.port))
            if job: # [REQ-F]-2.B
                url = job['JobCommand']['Inputs'][0]["Value"]
                # conn.sendall(url.encode()) # send the URL to the client scraper (proprietary link format)
                output = "" # placeholder; replace with data read back via conn.recv(...)
                self.respond_output(job, output) # [REQ-F]-3
            conn.close() # release the connection before accepting the next scraper

plugin = SamplePlugin()
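
The send and receive steps are left as placeholders in the plugin above because the scraper's wire format is proprietary. Purely as an illustration of what filling them in could look like, the sketch below assumes a hypothetical format: each message is UTF-8 text preceded by a 4-byte big-endian length. The helpers `send_msg`, `recv_exact`, and `recv_msg` are invented for this example and are not part of PCP or of any real scraper library.

```python
import socket
import struct

def send_msg(conn, text):
    """Send one UTF-8 message preceded by a 4-byte big-endian length (hypothetical format)."""
    payload = text.encode("utf-8")
    conn.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(conn, count):
    """Read exactly count bytes, raising if the peer closes early."""
    chunks = []
    remaining = count
    while remaining:
        chunk = conn.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_msg(conn):
    """Read one length-prefixed message and decode it."""
    (length,) = struct.unpack("!I", recv_exact(conn, 4))
    return recv_exact(conn, length).decode("utf-8")

# A connected socket pair stands in for the plugin <-> scraper link.
plugin_side, scraper_side = socket.socketpair()
send_msg(plugin_side, "http://example.com")     # plugin sends the URL taken from the job
url = recv_msg(scraper_side)                    # scraper reads the URL
send_msg(scraper_side, "<html>scraped</html>")  # scraper replies with the content
output = recv_msg(plugin_side)                  # plugin reads the text it would pass to respond_output
plugin_side.close()
scraper_side.close()
```

A length prefix is used here because a single `recv` call on a stream socket may return only part of a message; a real integration would follow whatever framing the third-party library actually defines.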

## Additional Notes

In this sample, the PCP system may invoke dozens or hundreds of SamplePlugin instances.

### PCP Guarantees
1. The PCP system may spin up a fleet of multiple instances of the same plugin.
2. The request_job_for_client function will give any given job to at most one of the plugins that request one.
3. The request_job_for_client function will deliver jobs in ascending order of job['StartTime'].
4. The request_job_for_client function will never deliver a job to a plugin where job['ExpireTime'] is in the past.
5. The request_job_for_client function will skip jobs for other clients to find the first job for the requested client.
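
The ordering, expiry, and client-filtering guarantees (3 to 5) can be modeled in memory as follows. This is a sketch of the described behavior, not the PCP implementation: `pick_job_for_client` is a hypothetical helper, the `JobTarget` / `Location` / `Port` field layout is an assumption made for the example, and guarantee 2 (delivery to at most one plugin) is not modeled because it depends on the backing database.

```python
import time

def pick_job_for_client(jobs, location, port, now=None):
    """In-memory model of guarantees 3-5; hypothetical helper, not PCP code."""
    now = time.time() if now is None else now
    for job in sorted(jobs, key=lambda j: j["StartTime"]):  # guarantee 3: ascending StartTime
        if job["ExpireTime"] < now:                          # guarantee 4: never deliver expired jobs
            continue
        target = job["JobTarget"]                            # assumed field layout
        if target["Location"] == location and target["Port"] == port:
            return job                                       # guarantee 5: first job for this client
    return None

def make_job(job_id, start, expire, location):
    """Build a minimal job record for the demonstration."""
    return {"id": job_id, "StartTime": start, "ExpireTime": expire,
            "JobTarget": {"Location": location, "Port": "10000"}}

jobs = [
    make_job("a", 10, 100, "10.0.0.7"),  # earliest, but for another client
    make_job("b", 20, 30, "10.0.0.5"),   # right client, already expired at now=50
    make_job("c", 40, 100, "10.0.0.5"),  # right client, valid, later start
    make_job("d", 35, 100, "10.0.0.5"),  # right client, valid, earlier start
]

selected = pick_job_for_client(jobs, "10.0.0.5", "10000", now=50)
nothing = pick_job_for_client(jobs, "10.0.0.9", "10000", now=50)
```

With these inputs, job "a" is skipped because it targets another client and job "b" because it has expired, so `selected` is job "d"; a client with no matching jobs gets `None`.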