This repository has been archived by the owner on Jun 19, 2020. It is now read-only.

Writing scripts

At its core, Phantombuster allows you to script "web robots" in two languages: JavaScript and CoffeeScript.

To create a new bot script, log in, go to your scripts page and simply enter a name. Click Advanced to select what kind of script you want to create.

Each script can be launched on our platform by one of the following commands (in fact, binaries):

  • Node: executes your scripts on V8, Chrome's JavaScript engine
  • PhantomJS: headless, scriptable WebKit browser (where Phantombuster got its name from)
  • CasperJS: framework built on top of PhantomJS to easily write complex navigation scenarios

You can also write your own modules to better control and optimize how your robots navigate the web, or use the one made by the Phantombuster team: Nick.

Quick start (read this!)

If you are in a hurry, please read at least the following four sections.

The best way to understand and get started quickly with Phantombuster is to try some sample scripts.

Once logged in (log in here if you haven't already), check out some of these scripts (choose your preferred language and framework or the first one if you don't know):

When viewing a script, click Quick Launch in the top right corner to run it. You'll see your script execute in real time. Just below the console output, your persistent storage is displayed (it's where your saved files will show up).

Do not hesitate to copy-paste these scripts to test more features.

Agents and scripts

Now that you have launched your first few scripts, you have probably noticed that they run within an agent (if you used the Quick Launch feature, the agent you used was named Quick Launch Agent).

Agents are configuration settings that describe how to run a certain script. They allow you to control how and when a script is launched. The combination of a script and an agent gives you a full-featured "web robot" that can scrape and automate tasks on the web.

To create an agent, go to your agents page and enter a name of your choice. The most important settings of an agent are which script to launch and when to launch it. But you'll see there are a lot of other options...

In what environment do my scripts run?

Your scripts are executed in Linux containers (they are similar to very light virtual machines).

Available to you are a few gigabytes of RAM, a few gigabytes of hard disk space and a fast internet connection. These are temporary resources that are freed right after your agent finishes its job.

What's important to know is that files written on your agent's disk will be lost when it exits. To keep files, save them to your persistent storage using our agent module <agent-module-file-storage>.

More technical details (for the nerds):
  • The container engine is Docker
  • Containers are running Debian
  • Agents always start in /home/phantom/agent which is empty
  • Agents run under the user phantom
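
The details above can be checked from inside a running agent. The following is a minimal sketch, assuming the script is launched with the node command. describeEnvironment is a hypothetical helper name used for illustration, not part of any Phantombuster module. It relies only on Node's built-in os module.

```javascript
"use strict";
"phantombuster command: node";

const os = require("os");

// Hypothetical helper: collects a few facts about the machine the script runs on.
function describeEnvironment() {
    let user = null;
    try {
        // Per the list above, this should be "phantom" on the platform.
        user = os.userInfo().username;
    } catch (err) {
        // os.userInfo() can throw when the current user has no account entry.
    }
    return {
        cwd: process.cwd(), // per the list above, /home/phantom/agent on the platform
        user: user,
        platform: os.platform(),
        totalMemoryGB: os.totalmem() / Math.pow(1024, 3)
    };
}

console.log(describeEnvironment());
```

Run locally, the same script reports your own machine's values, which makes it easy to compare the two environments.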

Phantombuster's SDK

All your scripts can easily be written right on our website, in the provided CoffeeScript/JavaScript web editor.

However, you might prefer using your own editor, locally on your machine. We made Phantombuster's SDK specifically for this.

The SDK will monitor a directory on your disk for changes in your scripts. As soon as a change is detected, the script will be uploaded to your Phantombuster account.
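
To illustrate the idea of watching a directory for script changes, here is a minimal sketch using Node's built-in fs.watch. This is not the SDK's actual implementation: watchScripts and its onChange callback are hypothetical names, and the upload step is left to the caller.

```javascript
"use strict";

const fs = require("fs");
const path = require("path");

// Hypothetical helper: calls onChange(filePath) whenever a .js or .coffee file
// in `directory` changes. A real tool would upload the file at that point.
function watchScripts(directory, onChange) {
    return fs.watch(directory, function (eventType, filename) {
        if (filename && /\.(js|coffee)$/.test(filename)) {
            onChange(path.join(directory, filename));
        }
    });
}

// Usage (commented out so that this file exits immediately when run):
// const watcher = watchScripts(".", function (file) {
//     console.log("changed:", file);
// });
// watcher.close(); // stop watching
```

Note that fs.watch may report several events for a single save, so a real tool would likely debounce before uploading.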

First, you need to have npm installed. Then do this:

# npm install -g phantombuster-sdk

It will globally install the phantombuster command. Discover how to use it → <SDK>

Requiring other scripts

All your scripts (and samples/libraries) can be required. The requiring script must have a phantombuster dependencies directive (similar to "use strict";) listing its dependencies.

"use strict";
"phantombuster command: casperjs";
"phantombuster package: 2";
// Comma separated list of dependencies
// Specify the full name (with extension)
"phantombuster dependencies: lib-Foo.js, lib-Nick-beta.coffee";

// The rest of your script...
// Declare variables with var: undeclared assignments throw in strict mode
var MyLib = require("lib-Foo");
var Nick = require("lib-Nick-beta");
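
Note that the directive lists full file names (with extension), while require() takes the name without it. The sketch below makes that mapping explicit. requireNamesFromSource is a hypothetical helper written for illustration, and it is not how the platform itself reads directives.

```javascript
"use strict";

// Hypothetical helper: reads the "phantombuster dependencies" directive from a
// script's source and returns, for each dependency, the name to pass to require().
function requireNamesFromSource(source) {
    const match = /["']phantombuster dependencies:\s*([^"']*)["']/.exec(source);
    if (!match) {
        return [];
    }
    return match[1]
        .split(",")
        .map(function (name) { return name.trim(); })
        .filter(function (name) { return name.length > 0; })
        .map(function (name) { return name.replace(/\.(js|coffee)$/, ""); });
}

const example = '"phantombuster dependencies: lib-Foo.js, lib-Nick-beta.coffee";';
console.log(requireNamesFromSource(example)); // [ 'lib-Foo', 'lib-Nick-beta' ]
```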

Writing your own modules

When the name of a script starts with lib, its launch will be disabled. This allows you to safely write reusable modules that can later be required using phantombuster dependencies and then require().

To create a new module, log in, go to your scripts page, select the reusable module tab and enter your module name.

// In script "lib-Foo.js"
"use strict";

module.exports = {
    foo: function() {
        console.log("bar");
    }
};

// In script "my-script.js"

"use strict";
"phantombuster command: casperjs";
"phantombuster package: 2";
"phantombuster dependencies: lib-Foo.js";

require("lib-Foo").foo(); // outputs "bar"

There are a few more subtleties to consider when writing your own modules → <writing-modules>

Locking a script's launch command

If you want to make sure a script is always launched with the same command, add a phantombuster command directive (similar to "use strict";).

// Possible values are: casperjs, phantomjs and node
"phantombuster command: node";
"phantombuster package: 2";
"use strict";

// The rest of your script...
var needle = require("needle");
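
As a rough illustration of the directive's format, the sketch below extracts the command from a script's source and checks it against the three values listed above. lockedCommand is a hypothetical helper, not a Phantombuster API, and the platform's own parsing may differ.

```javascript
"use strict";

// The three launch commands listed in this document.
const ALLOWED_COMMANDS = ["casperjs", "phantomjs", "node"];

// Hypothetical helper: returns the locked command of a script, null when the
// script has no command directive, and throws on an unknown value.
function lockedCommand(source) {
    const match = /["']phantombuster command:\s*([^"']*)["']/.exec(source);
    if (!match) {
        return null;
    }
    const command = match[1].trim();
    if (ALLOWED_COMMANDS.indexOf(command) === -1) {
        throw new Error("Unknown launch command: " + command);
    }
    return command;
}

console.log(lockedCommand('"phantombuster command: node";')); // node
```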