Plugins

Andy edited this page May 17, 2024 · 22 revisions

You can develop your plugin for any site you want. I created a plugin environment with full SCrawler integration. You can display any class properties in SCrawler's internal edit forms.

The class that contains all the available objects is the Instagram settings:

You can look at them as an example.

You can also use the following plugins as an example:

  • XVIDEOS - development time 6.5 hours
  • LPSG - development time 1.5 hours

See how easy it is to create a plugin and make it for your site.

The SCrawler.PluginProvider.dll is fully compatible with the most popular language - CSharp.

The .NET Framework version is 4.6.1.

Plugins environment

The plugin concept is based on two main interfaces. The first interface - ISiteSettings - is the major interface of the plugin. This class must provide all instances, validators, and other objects required by SCrawler. The second interface - IPluginContentProvider - is an instance of UserData.

How to make a plugin

  1. Create a new .NET Framework library project.
  2. Add a reference to SCrawler.PluginProvider.dll.
  3. Create a new settings class (e.g. SiteSettings).
  4. Implement the ISiteSettings interface.
  5. Add the required Manifest class attribute.
  6. Add optional attributes if you need them: SeparatedTasks, SavedPosts, SpecialForm.
  7. Set values for Site, Icon, Image.
    • Site is a required property (cannot be null).
    • If your plugin doesn't have an Icon or Image, these properties must return Nothing (null in C#).
    • Attention! Don't add any attributes to these properties!
  8. Define your own properties if you need to. To allow users to change the property values of your class and interact with SCrawler, the property must be an AutoProperty and must be declared as a PropertyValue object.
  9. Write the code for the interface functions. If a function has a return type, then that function MUST return a value!
    • Develop a user instance of the IPluginContentProvider interface.
    • This instance must be returned by the GetInstance function.
  10. If your plugin has special options, develop an options exchange class and an options editor form.
  11. It is done. Your plugin is ready to work. 😊

How it works

Download

stateDiagram
classDef OK fill:#063,color:white,font-weight:bold
classDef pluginInt fill:#ff9
class Down OK
class DownSet, GetMedia, ReconfList, GrabBack, Down2 pluginInt

Start: User added to download
HostAvailable: ISiteSettings.Available
HostReady: ISiteSettings.ReadyToDownload
HostBefore: ISiteSettings.BeforeStartDownload
HostAfter: ISiteSettings.AfterDownload
Down: Download
Exit: Exit
Skip: Skip downloading
DownSet: Set class (IPluginContentProvider) parameters (1)
GetMedia: Call IPluginContentProvider.GetMedia function (2)
ReconfList: Reconfigure media list to send to downloader (4)
GrabBack: Get back parameters (3)
Down2: Call IPluginContentProvider.Download function


[*]-->Start

Start-->HostAvailable
HostAvailable-->HostReady: True
HostAvailable-->Exit: False

HostReady-->Skip: False
HostReady-->HostBefore: True
Skip-->Exit

HostBefore-->Down
Down-->DownSet
DownSet-->GetMedia
GetMedia-->GrabBack
GrabBack-->ReconfList
ReconfList-->Down2
Down2-->HostAfter
HostAfter-->Exit

Exit-->[*]

Notes:

  1. SCrawler sets all IPluginContentProvider properties.
  2. This is where you need to get the posts, their download URLs, and information about the posts (such as date, id, user, etc.).
  3. SCrawler gets back parameters Name, ID, UserDescription, UserExists and UserSuspended.
  4. All posts that are not requested by the user will be removed from the TempMediaList (e.g. images will be removed if the user only wants to download videos).
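
The order of calls in the download diagram can be sketched in C# as follows. This is only an illustration, not SCrawler's actual code: ISiteSettingsSketch, IContentProviderSketch, DemoHost and DemoUser are hypothetical stand-ins reduced to the calls shown in the diagram, and the Download What arguments of the real functions are left out.

```csharp
using System.Collections.Generic;
using System.Threading;

// Stand-in for ISiteSettings, reduced to the calls shown in the diagram.
public interface ISiteSettingsSketch
{
    bool Available();
    bool ReadyToDownload();
    void BeforeStartDownload(object user);
    void AfterDownload(object user);
}

// Stand-in for IPluginContentProvider, reduced in the same way.
public interface IContentProviderSketch
{
    void GetMedia(CancellationToken token);
    void Download(CancellationToken token);
}

public static class DownloadFlowSketch
{
    // Walks the states of the diagram and returns the visited steps in order.
    public static List<string> Run(ISiteSettingsSketch host, IContentProviderSketch user, CancellationToken token)
    {
        var steps = new List<string> { "Available" };
        if (!host.Available()) return steps;         // False: exit

        steps.Add("ReadyToDownload");
        if (!host.ReadyToDownload()) return steps;   // False: skip downloading

        steps.Add("BeforeStartDownload");
        host.BeforeStartDownload(user);

        steps.Add("SetParameters");                  // (1) done by SCrawler, not by the plugin
        steps.Add("GetMedia");                       // (2) the plugin collects the posts
        user.GetMedia(token);
        steps.Add("GetBackParameters");              // (3) SCrawler reads Name, ID, etc.
        steps.Add("ReconfigureList");                // (4) SCrawler filters the media list
        steps.Add("Download");
        user.Download(token);

        steps.Add("AfterDownload");
        host.AfterDownload(user);
        return steps;
    }
}

// Trivial implementations used only to exercise the sketch.
public sealed class DemoHost : ISiteSettingsSketch
{
    private readonly bool _available;
    private readonly bool _ready;
    public DemoHost(bool available, bool ready) { _available = available; _ready = ready; }
    public bool Available() { return _available; }
    public bool ReadyToDownload() { return _ready; }
    public void BeforeStartDownload(object user) { }
    public void AfterDownload(object user) { }
}

public sealed class DemoUser : IContentProviderSketch
{
    public void GetMedia(CancellationToken token) { }
    public void Download(CancellationToken token) { }
}
```

Calling DownloadFlowSketch.Run(new DemoHost(true, false), new DemoUser(), CancellationToken.None) stops after ReadyToDownload, which corresponds to the "Skip downloading" branch.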

Create user

stateDiagram

Start: User creating
HostMyUser: ISiteSettings.IsMyUser
HostOptions: ISiteSettings.UserOptions
Exit
note right of HostOptions
	Open form if OpenForm is true
end note

[*]-->Start

Start-->HostMyUser
HostMyUser-->Exit: False
HostMyUser-->HostOptions: True
HostOptions-->Exit

Exit-->[*]

Edit site settings

stateDiagram

Start: Call ISiteSettings.BeginEdit
SaveBefore: Call ISiteSettings.BeginUpdate
Save: Call ISiteSettings.Update
SaveAfter: Call ISiteSettings.EndUpdate
Exit: Call ISiteSettings.EndEdit

[*]-->Start

Start-->SaveBefore: Ok
Start-->Exit: Cancel
SaveBefore-->Save
Save-->SaveAfter
SaveAfter-->Exit

Exit-->[*]
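
The same edit sequence, written as a minimal C# sketch. IEditableSettingsSketch and LoggingSettings are hypothetical stand-ins that contain only the five functions from the diagram; the real calls are made by the SCrawler site editor form.

```csharp
using System.Collections.Generic;

// Stand-in for the editing part of ISiteSettings.
public interface IEditableSettingsSketch
{
    void BeginEdit();
    void BeginUpdate();
    void Update();
    void EndUpdate();
    void EndEdit();
}

public static class EditFlowSketch
{
    // Ok runs the BeginUpdate/Update/EndUpdate chain; Cancel goes straight to EndEdit.
    public static void Run(IEditableSettingsSketch settings, bool userPressedOk)
    {
        settings.BeginEdit();
        if (userPressedOk)
        {
            settings.BeginUpdate();
            settings.Update();
            settings.EndUpdate();
        }
        settings.EndEdit();
    }
}

// Records the call order so the flow can be inspected.
public sealed class LoggingSettings : IEditableSettingsSketch
{
    public readonly List<string> Calls = new List<string>();
    public void BeginEdit() { Calls.Add("BeginEdit"); }
    public void BeginUpdate() { Calls.Add("BeginUpdate"); }
    public void Update() { Calls.Add("Update"); }
    public void EndUpdate() { Calls.Add("EndUpdate"); }
    public void EndEdit() { Calls.Add("EndEdit"); }
}
```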

Interfaces

ISiteSettings

interface ISiteSettings : IDisposable
{
    Icon Icon { get; }
    Image Image { get; }
    string Site { get; }
    string CMDEncoding { get; set; }
    IEnumerable<string> EnvironmentPrograms { get; set; }
    string UserAgentDefault { get; set; }
    void EnvironmentProgramsUpdated();
    string AccountName { get; set; }
    bool Temporary { get; set; }
    ISiteSettings DefaultInstance { get; set; }
    bool SubscriptionsAllowed { get; }
    ILogProvider Logger { get; set; }
    string GetUserUrl(IPluginContentProvider User);
    ExchangeOptions IsMyUser(string UserURL);
    ExchangeOptions IsMyImageVideo(string URL);
    IPluginContentProvider GetInstance(Download What);
    void BeginInit();
    void EndInit();
    string AvailableText { get; set; }
    bool Available();
    bool ReadyToDownload();
    void DownloadStarted(Download What);
    void BeforeStartDownload(object User, Download What);
    void AfterDownload(object User, Download What);
    void DownloadDone(Download What);
    ISiteSettings Clone(bool Full);
    void Delete();
    void BeginEdit();
    void EndEdit();
    void BeginUpdate();
    void EndUpdate();
    void Update();
    void Update(ISiteSettings Source);
    void Reset();
    void OpenSettingsForm();
    void UserOptions(ref object Options, bool OpenForm);
    string GetUserPostUrl(IPluginContentProvider User, IUserMedia Media);
}

Site, Icon and Image are properties that provide the site name and site icon (in Icon and Image formats).

CMDEncoding - the command-line encoding selected by the user.

EnvironmentPrograms - user-selected paths to programs such as yt-dlp, gallery-dl, ffmpeg, curl, etc.

UserAgentDefault - the UserAgent that the user has configured in the global settings form.

EnvironmentProgramsUpdated - this function will be called when EnvironmentPrograms or CMDEncoding changes.

The AccountName property is set before the BeginInit function is called.

The Temporary property means that this instance is temporary. It is used when cloning an existing instance or creating a new one (before the data is saved).

For instances that are not the default one, the DefaultInstance property will be set to the default instance. In the default instance, this value is Nothing (null in C#).

AvailableText property must store the notification text (if it exists) when the Available function is called with the argument Silent = True.

SubscriptionsAllowed indicates that the plugin allows users to be created in subscription mode.

The Clone function must return a clone of the current instance in order to create a new one (based on the current one).

The Delete function is called when the user deletes the current profile.

IsMyUser and IsMyImageVideo are URL validators. These functions should return a value (ExchangeOptions with Exists set to true) indicating whether the user or media belongs to your plugin.
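
A minimal sketch of such a validator, under stated assumptions: ExchangeOptionsSketch is a hypothetical stand-in that only copies the field names listed in the ExchangeOptions section (the field types are guesses), and the site name and URL layout (example.com/users/name) are invented for illustration. In a real plugin, use the ExchangeOptions structure and its initializers from SCrawler.PluginProvider.dll.

```csharp
using System.Text.RegularExpressions;

// Stand-in for ExchangeOptions; field names follow this page, types are assumed.
public struct ExchangeOptionsSketch
{
    public string UserName;
    public string SiteName;
    public string HostKey;
    public string Options;
    public bool Exists;
}

public static class UrlValidatorSketch
{
    // Hypothetical URL layout: https://example.com/users/<name>
    private static readonly Regex UserUrl = new Regex(
        @"^https?://(www\.)?example\.com/users/(?<name>[^/?#]+)/?$",
        RegexOptions.IgnoreCase);

    public static ExchangeOptionsSketch IsMyUser(string userUrl)
    {
        // Default value: Exists = false, meaning "this URL is not mine".
        var result = new ExchangeOptionsSketch();
        if (string.IsNullOrWhiteSpace(userUrl)) return result;

        Match m = UserUrl.Match(userUrl.Trim());
        if (m.Success)
        {
            result.UserName = m.Groups["name"].Value;
            result.SiteName = "Example";
            result.Exists = true;
        }
        return result;
    }
}
```

An IsMyImageVideo validator could follow the same pattern with a regular expression for media URLs.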

GetInstance must return an instance for downloading data. The What parameter specifies what type of instance is being requested.

BeginInit and EndInit are functions that are called when plugin initialization starts and when it completes.

BeginEdit and EndEdit are functions that are called when the plugin settings are being edited by the SCrawler site editor form.

BeginUpdate, EndUpdate and Update are functions that are called by the site editor form (when the plugin settings have changed and need to be saved) and by the settings class.

The Update(Source) function is called when the user has created a new instance and added it to SCrawler. The Source argument stores the data that must be copied to the current instance.

The Available function must return a value indicating that the site is available. If false is returned, users will not be downloaded. It is called when the jobs pool is created.

The ReadyToDownload function must return the same value as the Available function, but it is called before a particular user is actually downloaded.

BeforeStartDownload and AfterDownload are functions that are called before and after a particular user is downloaded.

The DownloadDone function is called after all jobs have completed.

If your plugin has an additional settings form, then this form must be opened in ShowDialog mode when calling the OpenSettingsForm function.

UserOptions is the function for exchanging additional user options when creating a user. If OpenForm argument is true, you must open a form (in ShowDialog mode) where the user can manage additional user options.

GetUserPostUrl must return the post url provided by the Media argument.

IDownloadableMedia

interface IDownloadableMedia : IUserMedia, IDisposable
{
    event EventHandler CheckedChange;
    event EventHandler ThumbnailChanged;
    event EventHandler StateChanged;
    Icon SiteIcon { get; }
    string Site { get; }
    string SiteKey { get; }
    string AccountName { get; set; }
    string ThumbnailUrl { get; set; }
    string ThumbnailFile { get; set; }
    string Title { get; set; }
    int Size { get; set; }
    TimeSpan Duration { get; set; }
    object Progress { get; set; }
    bool HasError { get; }
    bool Exists { get; }
    bool Checked { get; set; }
    IPluginContentProvider Instance { get; set; }
    void Download(bool UseCookies, System.Threading.CancellationToken Token);
    void Delete(bool RemoveFiles);
    void Load(string File);
    void Save();
    string ToString();
    string ToString(bool ForMediaItem);
}

Raise the CheckedChange event when the Checked value has changed. Raise the ThumbnailChanged event when the ThumbnailUrl or ThumbnailFile value has changed. Raise the StateChanged event when the DownloadState value has changed.

Site is the name of your plugin's site. SiteKey is the name of the plugin. ThumbnailUrl, ThumbnailFile, Title, Size and Duration are information about the media being downloaded.

Progress is a class that manages a progress bar. Don't set this property manually. This property is necessary for the internal needs of SCrawler.

Exists is a property indicating that the media is parsed or loaded from a file. It should be true if the file has been downloaded and exists, or if it has not been downloaded and the URL is OK.

Checked is a value indicating the checked state on the form.

Instance is an instance of IPluginContentProvider. Don't set this property manually. It will be set later during SCrawler's internal algorithms.

Download is a function that will be called when the user downloads a file.

Delete is the function that will be called when the user requests to remove a file from the list. RemoveFiles indicates that the user also wants to delete the downloaded file.

Load is the function that will be called when the user opens the downloader form. Save is the function that will be called when the user downloads the file.

ToString is the function that should return the title of the media. ForMediaItem is an argument indicating that the title is being requested for a media list item.

IPluginContentProvider

interface IPluginContentProvider : IDisposable
{
    delegate void ProgressChangedEventHandler(int Count);
    event ProgressChangedEventHandler ProgressChanged;
    delegate void ProgressMaximumChangedEventHandler(int Value, bool Add);
    event ProgressMaximumChangedEventHandler ProgressMaximumChanged;
    event ProgressChangedEventHandler ProgressPreChanged;
    event ProgressMaximumChangedEventHandler ProgressPreMaximumChanged;
    IThrower Thrower { get; set; }
    ILogProvider LogProvider { get; set; }
    ISiteSettings Settings { get; set; }
    string AccountName { get; set; }
    string Name { get; set; }
    string ID { get; set; }
    string Options { get; set; }
    bool ParseUserMediaOnly { get; set; }
    string UserDescription { get; set; }
    List<IUserMedia> ExistingContentList { get; set; }
    List<string> TempPostsList { get; set; }
    List<IUserMedia> TempMediaList { get; set; }
    bool UserExists { get; set; }
    bool UserSuspended { get; set; }
    bool IsSavedPosts { get; set; }
    string IsSubscription { get; set; }
    bool SeparateVideoFolder { get; set; }
    string DataPath { get; set; }
    Nullable<int> PostsNumberLimit { get; set; }
    Nullable<DateTime> DownloadDateFrom { get; set; }
    Nullable<DateTime> DownloadDateTo { get; set; }
    object ExchangeOptionsGet();
    void ExchangeOptionsSet(object Obj);
    void XmlFieldsSet(List<KeyValuePair<string, string>> Fields);
    List<KeyValuePair<string, string>> XmlFieldsGet();
    void GetMedia(System.Threading.CancellationToken Token);
    void Download(System.Threading.CancellationToken Token);
    void ResetHistoryData();
}

Raise the ProgressChanged event when the file is downloaded. Raise the ProgressMaximumChanged event before downloading files. The Value argument specifies the number of media files to download.

Raise the ProgressPreChanged event when parsing content. Raise the ProgressPreMaximumChanged event when the amount of parse content has changed. The Value argument specifies the amount of content to be parsed.
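
A minimal sketch of raising the download progress events. ProgressSketch is a hypothetical class that declares the two delegates with the same shapes as in the interface listing; it is not a full IPluginContentProvider. Two points are assumptions, not documented behavior: that Count is the number of files just completed, and that Add = false means "set the maximum" rather than "add to it". The real download is replaced by a callback.

```csharp
using System;
using System.Collections.Generic;

public class ProgressSketch
{
    // Same delegate shapes as in the interface listing.
    public delegate void ProgressChangedEventHandler(int Count);
    public delegate void ProgressMaximumChangedEventHandler(int Value, bool Add);

    public event ProgressChangedEventHandler ProgressChanged;
    public event ProgressMaximumChangedEventHandler ProgressMaximumChanged;

    // Pretends to download every URL; downloadOne stands in for the real work.
    public void Download(IList<string> urls, Action<string> downloadOne)
    {
        // Before downloading: announce how many files are about to be downloaded.
        // Assumption: Add = false sets the maximum instead of adding to it.
        ProgressMaximumChanged?.Invoke(urls.Count, false);

        foreach (string url in urls)
        {
            downloadOne(url);
            // After each file: report progress.
            // Assumption: Count is the number of files just completed.
            ProgressChanged?.Invoke(1);
        }
    }
}
```

The ProgressPreChanged and ProgressPreMaximumChanged events could be raised the same way from the parsing code.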

See the Thrower, LogProvider and Settings property environments in their respective chapters. Don't set these properties manually. They will be set by SCrawler.

Name is the username returned from the settings class when a new user is created. ID is the user ID on the site. UserDescription is the user description. UserExists and UserSuspended are properties indicating whether the user exists or is suspended on the site. Options are the options that can be sent from ISiteSettings. IsSubscription indicates that you must set the URL of the IUserMedia structure to the URL of the video preview image (screenshot), not to the URL of the video file.

ExchangeOptionsGet and ExchangeOptionsSet are the functions for exchanging additional user options when the user is being edited by the user.

XmlFieldsGet and XmlFieldsSet are the functions for exchanging additional user options when the user is being loaded by the SCrawler (from an XML file).

ParseUserMediaOnly, IsSavedPosts, PostsNumberLimit, DownloadDateFrom, DownloadDateTo are properties that define what the user wants to download.

SeparateVideoFolder is a property that specifies that the video content should be separated from other content (such as images).

TempPostsList is a list of all post IDs downloaded by the plugin. ExistingContentList is a list of posts downloaded by the plugin. TempMediaList is a temporary list of posts which should be downloaded by the plugin.

GetMedia: collect posts from the site and add them to the TempMediaList list. Download: download posts provided by the TempMediaList list.

ResetHistoryData is a function that will be called when the user resets the downloaded history.

IUserMedia

This is an interface to create a class or structure to exchange data (to be downloaded) between your plugin and SCrawler. There is already a PluginUserMedia structure that implements this interface, but you can create your own.

IPropertyProvider

interface IPropertyProvider : IFormatProvider
{
    string PropertyName { get; set; }
}

If your IFormatProvider class implements this interface, the PropertyName property will be set to the currently editable property name by SCrawler's internal algorithms.

IThrower

This is an exception thrower. If the user has requested to cancel the operation or delete the user instance, an exception will be thrown. Use the IThrower.ThrowAny method in your code to stop execution when requested!

Exceptions:

  • OperationCanceledException - when user requested to cancel the operation
  • ObjectDisposedException - when user deletes the user instance
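
A minimal sketch of calling ThrowAny inside a parsing loop. IThrowerSketch is a hypothetical stand-in that models only the ThrowAny method named above. TokenThrower is an invented helper, backed by a CancellationToken, used only to make the example runnable; it models the cancel case only. In a real plugin, SCrawler supplies the thrower through the Thrower property.

```csharp
using System.Collections.Generic;
using System.Threading;

// Stand-in: only the ThrowAny method is modelled.
public interface IThrowerSketch
{
    void ThrowAny();
}

// Hypothetical thrower for demonstration; throws OperationCanceledException on cancel.
public sealed class TokenThrower : IThrowerSketch
{
    private readonly CancellationToken _token;
    public TokenThrower(CancellationToken token) { _token = token; }
    public void ThrowAny() { _token.ThrowIfCancellationRequested(); }
}

public static class ParserSketch
{
    // Collects page names until all pages are done or the thrower interrupts the loop.
    public static List<string> ParsePages(IThrowerSketch thrower, int pageCount)
    {
        var parsed = new List<string>();
        for (int page = 1; page <= pageCount; page++)
        {
            thrower.ThrowAny();           // stop here if a cancel or delete was requested
            parsed.Add("page " + page);   // placeholder for the real request and parsing
        }
        return parsed;
    }
}
```

Checking at the top of each iteration keeps the delay between the user's request and the actual stop to at most one page.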

ILogProvider

This is an interface for sending exceptions and messages to the program log.

Functions:

  • Add(ByVal Message As String) - add Message to the program log
  • Add
    • ex - your exception
    • Message - additional message
    • ShowMainMsg - show main message
    • ShowErrorMsg - show error message
    • SendToLog - message and error message will be sent to the program log

Objects

PropertyValue

This is the base object for interacting with SCrawler and bi-directional communication.

There are three initialization constructors:

  • By initial value. If the value is null, this constructor will throw an error! Otherwise, the type will be extracted from the value.
  • By value and type.
  • By value, type and function. If you want, you can delegate a function to be called when the value changes. You can see an example in the Instagram SiteSettings class

The Checked parameter is intended to define the inheritance value and other parameters that you can use with the OnCheckboxCheckedChange function.

If your initial value is null, you MUST set the type.

Use BeginInit and EndInit to pause handlers. Use Clone to properly clone a property using the ISiteSettings Clone and Update functions.

Only these types are available: see PropertyValue value types.

IPropertyValue

Don't use this interface. This interface is only compatible with SCrawler. Always use the PropertyValue class!

PropertyData

This is a structure for exchanging properties between classes without saving. It is currently used for the PropertiesDataChecker attribute.

  • Name - property name
  • Value - property value

ExchangeOptions

For your convenience, this structure provides two default initializers. Use them so you don't forget the fields.

Currently used in IsMyUser and IsMyImageVideo

  • UserName - user name or any other data (if specified in the method description)
  • SiteName - site name
  • HostKey - set automatically via SettingsHost
  • Options - are the options you want to send to the created user.
  • Exists

ExitException

Represents errors that occur during downloading to be thrown to the root downloading function.

  • SimpleLogLine - add only the message to the log, without adding a StackTrace. Default: True.
  • Silent - don't add a message to the log. Default: False.

Attributes

PropertyOption

This attribute allows you to add your property to the SCrawler site settings form. Only works with a PropertyValue object.

Options:

  • PropertyName - automatically set when attribute is initialized. Corresponds to the property name.
  • Type - property value type. Can be obtained automatically from PropertyValue.
  • ControlText - this text will be displayed on the control in the settings form.
  • ControlToolTip - Control ToolTip.
  • ThreeStates - CheckBox option. If true, then the CheckBox will be displayed in three states: Checked, Unchecked, Indeterminate. Default: false.
  • AllowNull - If false, then the settings form will display an error message when trying to save the property value if the property value is null. Default: true.
  • LeftOffset - Just a design option. This is just a control offset from the left border of the form (just for beauty).
  • IsAuth - Default: false. If at least one property has a PropertyOption with this parameter, then the controls in the settings form will be divided into two blocks: Authorization and Other. Just a design option.
  • InheritanceName - the name of a constant from global settings for value inheritance and automatic updating.
  • Category - category name.
  • IsInformationLabel - Just a design option. Specifies that this property is just information.
  • LabelTextAlign - The alignment of the control (label) text, if IsInformationLabel.

DependentFields

This attribute specifies which properties should be updated in the settings form when the property to which this attribute belongs is updated. Only works with PropertyValue object and PropertyUpdater attribute.

PXML

This attribute specifies that your property should be added to the SCrawler settings XML file. When launched, SCrawler will replace the value of your property with the value from the XML settings file.

Again. Only works with PropertyValue object.

DoNotUse

This attribute allows you to exclude a property from the setting environment. Suitable for overrides.

  • Value - a value that indicates whether this field should be used or not. True = don't use; False = Use.

Only works with PropertyValue object.

PropertyUpdater

This attribute provides a mechanism for updating a specific property. The attribute can only be applied to a method. Allows multiple definitions. The function MUST return a Boolean (bool in C#), with true indicating that the value has been updated and false otherwise.

  • UpdatingPropertyName - the name of the property to be updated
  • Arguments - an array of arguments to be sent to your function. The argument is a name of a property in your class. The length of the array must match the number of function arguments.

Manifest

This attribute provides the plugin key to associate users with the plugin. I recommend using the pattern: DeveloperName_Site.

SpecialForm

I have developed a fully integrated plugin environment. This means that any of your class properties can be displayed in the SCrawler internal forms (for example, SettingsForm, UserCreatorForm). But if your plugin requires authorization or you want to manipulate your settings in a separate form, you can use the SpecialForm class attribute. This attribute allows multiple definitions for the same class and has two working modes.

SpecialForm(True), for class settings. This means that an additional button will be added to the site settings form to open your settings form. In this case, it is necessary to develop an additional form that must be opened (using the ShowDialog method) by the OpenSettingsForm class function.

SpecialForm(False), for user settings. This means that the Options button will be enabled on the user creator form. In this case, you must develop an additional exchange class and form. This works with the UserOptions class function. This function has two arguments: Options (reference) and OpenForm.

Provider

Options:

  • FieldsChecker - Read below for this option.
  • Interaction - If True, the value in the text field will be changed via the provider when the text changes. The default value is False.

This attribute indicates that the property is a provider. The provider must be an IFormatProvider.

There are two working modes: with field checking (FieldsChecker) and without.

FieldsChecker = true. Converts the text from the TextBox in the settings form to a value.

FieldsChecker = false. Converts a value using a specific converter. Also used to convert a value from XML to an object.

ControlNumber

Just a design attribute. Allows you to place controls on the settings form in a specific order. The number means the zero-based control position at the top of the form.

PropertiesDataChecker

An attribute for the validation method. Names is an array of property names to check values. The method must be a function with an IEnumerable argument of PropertyData and boolean return.

Returning true means the values are OK.

If you need to check the values of some properties before saving the settings, use this attribute.
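
A minimal sketch of a validation method. PropertyDataSketch and PropertiesDataCheckerSketchAttribute are hypothetical stand-ins built from the descriptions on this page (Name and Value fields, a Names array); the real types come from SCrawler.PluginProvider.dll and their exact signatures may differ. The property names Login and Password are invented, and Value is assumed to be of type object.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for PropertyData: a property name and its value (type assumed to be object).
public struct PropertyDataSketch
{
    public string Name;
    public object Value;
    public PropertyDataSketch(string name, object value) { Name = name; Value = value; }
}

// Stand-in for the PropertiesDataChecker attribute.
[AttributeUsage(AttributeTargets.Method)]
public sealed class PropertiesDataCheckerSketchAttribute : Attribute
{
    public string[] Names { get; private set; }
    public PropertiesDataCheckerSketchAttribute(params string[] names) { Names = names; }
}

public class SettingsWithCheck
{
    // Returns true (values are OK) only when every passed property
    // holds a non-empty string.
    [PropertiesDataCheckerSketch("Login", "Password")]
    public bool CheckCredentials(IEnumerable<PropertyDataSketch> data)
    {
        return data.All(p => !string.IsNullOrWhiteSpace(p.Value as string));
    }
}
```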

SeparatedTasks

If your plugin needs to run on a separate thread, add the SeparatedTasks class attribute. If you want to limit the number of download tasks for your plugin, you can set the number of tasks in this attribute. For example: SeparatedTasks(1) means that only one user will be downloaded per task; SeparatedTasks(2) means that two users will be downloaded at the same time for each task. The number argument is optional: if you don't specify the number of tasks, the global number of tasks will be used instead.

In addition, the TaskCounter property attribute has the highest priority. This means that if you create a property with a TaskCounter attribute (to allow users to change the number of tasks), the number set in SeparatedTasks will be ignored. But the SeparatedTasks attribute MUST be added to the class anyway if you want to run your plugin on a separate thread.
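
The priority described above can be summarized in a small C# sketch. TaskCountSketch and its three parameters are hypothetical: SCrawler resolves the real values internally, and this function only restates the rule.

```csharp
public static class TaskCountSketch
{
    // globalTasks: the global number of tasks.
    // separatedTasksNumber: the number given to SeparatedTasks, or null if omitted.
    // taskCounterValue: the value of a TaskCounter property, or null if there is none.
    public static int Resolve(int globalTasks, int? separatedTasksNumber, int? taskCounterValue)
    {
        if (taskCounterValue.HasValue) return taskCounterValue.Value;          // TaskCounter has the highest priority
        if (separatedTasksNumber.HasValue) return separatedTasksNumber.Value;  // SeparatedTasks(n)
        return globalTasks;                                                    // SeparatedTasks without a number
    }
}
```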

TaskCounter

If you want users to be able to control the number of download tasks in a thread, use this attribute. Only works with SeparatedTasks.

TaskGroup

Group this plugin into a specific download group. Look at the code to see how to use this attribute.

SavedPosts

If your plugin provides downloading of saved posts, add the SavedPosts class attribute.

UseInternalDownloader

User data instance attribute (for IPluginContentProvider). Means that after parsing, SCrawler should use the internal downloader.

Github

Assembly attribute. Not used yet, but in future versions will allow SCrawler to check for updates to your plugin.

  • Name - your GitHub username
  • RepoName - your repository name

ReplaceInternalPluginAttribute

If you want your plugin to overwrite the internal one, use this attribute.

PropertyValue value types

VB.NET      C#          .NET
Boolean     bool        System.Boolean
Byte        byte        System.Byte
SByte       sbyte       System.SByte
Short       short       System.Int16
UShort      ushort      System.UInt16
Integer     int         System.Int32
UInteger    uint        System.UInt32
Long        long        System.Int64
ULong       ulong       System.UInt64
Double      double      System.Double
Single      float       System.Single
Decimal     decimal     System.Decimal
String      string      System.String
Date        DateTime    System.DateTime
TimeSpan    TimeSpan    System.TimeSpan