python: send "bytes" instead of "str" to callbacks in Python 3 when the string is not UTF-8 valid (issue #1220, closes #1389)
flashcode committed Oct 12, 2019
1 parent 8fc8f72 commit 513f5a1
Showing 13 changed files with 903 additions and 200 deletions.
1 change: 1 addition & 0 deletions ChangeLog.adoc
@@ -46,6 +46,7 @@ Bug fixes::
* irc: remove option irc.network.channel_encode, add server option "charset_message" to control which part of the IRC message is decoded/encoded to the target charset (issue #832)
* irc: use path from option xfer.file.upload_path to complete filename in command "/dcc send" (issue #60)
* logger: fix write in log file if it has been deleted or renamed (issue #123)
* python: send "bytes" instead of "str" to callbacks in Python 3 when the string is not UTF-8 valid (issue #1389)
* xfer: fix memory leak when a xfer is freed and when the plugin is unloaded

Tests::
174 changes: 143 additions & 31 deletions doc/de/weechat_scripting.de.adoc
@@ -3,9 +3,10 @@
:email: flashcode@flashtux.org
:lang: de
:toc: left
:toclevels: 3
:toclevels: 4
:toc-title: Inhaltsverzeichnis
:sectnums:
:sectnumlevels: 3
:docinfo1:


@@ -73,22 +74,95 @@ und die Dokumentation für die Funktion `hook_process` in link:weechat_plugin_ap

==== Python

* WeeChat muss als Modul eingebunden werden: `import weechat`
* Um die WeeChat Funktion `+print*+` nutzen zu können muss `+prnt*+` genutzt
werden (_print_ ist ein reservierter Befehl von Python!)
* Funktionen werden im Format `weechat.xxx(arg1, arg2, ...)` ausgeführt
// TRANSLATION MISSING
===== Module

WeeChat defines a `weechat` module which must be imported with `import weechat`.

// TRANSLATION MISSING
===== Functions

Functions are called with `weechat.xxx(arg1, arg2, ...)`.

Functions `+print*+` are called `+prnt*+` in Python (because `print` was a
reserved keyword in Python 2).

// TRANSLATION MISSING
===== Strings received in callbacks

In Python 3 and with WeeChat ≥ 2.7, the strings received in callbacks have type
`str` if the string is valid UTF-8 (which is the most common case),
or `bytes` if the string is not valid UTF-8. A callback that may receive
invalid UTF-8 content must therefore handle both types.

Invalid UTF-8 data may be received in the following cases, where the callback
can receive a string of type `str` or `bytes` (this list is not exhaustive):

[width="100%",cols="3m,3m,3m,8",options="header"]
|===
| API function | Arguments | Examples | Description

| hook_modifier |
irc_in_yyy |
pass:[irc_in_privmsg] +
pass:[irc_in_notice] |
A message received in IRC plugin, before it is decoded to UTF-8 (used
internally). +
+
It is recommended to use modifier `irc_in2_yyy` instead, because the string
received is always valid UTF-8. +
See function `hook_modifier` in the
link:weechat_plugin_api.en.html#_hook_modifier[WeeChat plugin API reference].

| hook_signal |
xxx,irc_out_yyy +
xxx,irc_outtags_yyy |
pass:[*,irc_out_privmsg] +
pass:[*,irc_out_notice] +
pass:[*,irc_outtags_privmsg] +
pass:[*,irc_outtags_notice] |
A message sent by IRC plugin, after it is encoded to the `encode` charset
defined by the user (if different from the default `UTF-8`). +
+
It is recommended to use signal `xxx,irc_out1_yyy` instead, because the string
received is always valid UTF-8. +
See function `hook_signal` in the
link:weechat_plugin_api.en.html#_hook_signal[WeeChat plugin API reference].

| hook_process +
hook_process_hashtable |
- |
- |
Output of the command, sent to the callback, can contain invalid UTF-8 data.

|===
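
The rule described above can be illustrated outside WeeChat. The sketch below is not WeeChat's implementation; `callback_value` is a hypothetical helper that only mimics the documented behavior (a `str` for valid UTF-8, `bytes` otherwise), and the sample message is made up.

```python
def callback_value(raw: bytes):
    """Mimic the documented rule: str when raw is valid UTF-8, else bytes.

    Hypothetical helper for illustration only, not part of the WeeChat API.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw


# The same text encoded with two charsets (sample data, not from WeeChat).
utf8_message = "café".encode("utf-8")
latin1_message = "café".encode("iso-8859-1")

# Valid UTF-8 would reach the callback as str.
print(type(callback_value(utf8_message)).__name__)

# ISO-8859-1 bytes are not valid UTF-8 here, so the raw bytes are kept.
print(type(callback_value(latin1_message)).__name__)
```

This is the situation a script hooked on `irc_in_yyy` or `xxx,irc_out_yyy` may run into when the user has configured a charset other than UTF-8.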

In Python 2, which is now deprecated and should not be used any more, the
strings sent to callbacks were always of type `str` and could contain invalid
UTF-8 data in the cases mentioned above.
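
One way a script might cope with both types is to normalize the value at the top of each callback. This is a minimal sketch under assumptions: `to_text` and its `fallback_charset` parameter are hypothetical names, and falling back to ISO-8859-1 is an arbitrary choice that suits display or logging, not round-tripping the original bytes.

```python
def to_text(value, fallback_charset="iso-8859-1"):
    """Return a str for a value received in a callback.

    Hypothetical helper: str values pass through unchanged; bytes values
    (invalid UTF-8) are decoded with an assumed fallback charset, replacing
    anything that still cannot be decoded.
    """
    if isinstance(value, bytes):
        return value.decode(fallback_charset, errors="replace")
    return value


# A str argument is returned as-is.
print(to_text("hello"))

# A bytes argument is decoded with the fallback charset.
print(to_text(b"caf\xe9"))
```

A callback for `hook_process`, for example, could call `to_text` on its output argument before searching or printing it. Scripts that need the exact bytes should keep the `bytes` value instead of decoding it.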

==== Perl

* Funktionen werden im Format `weechat::xxx(arg1, arg2, ...);` ausgeführt
// TRANSLATION MISSING
===== Functions

Functions are called with `weechat::xxx(arg1, arg2, ...);`.

==== Ruby

* Es muss _weechat_init_ definiert und darin die Funktion _register_ ausgeführt werden
* Funktionen werden im Format `Weechat.xxx(arg1, arg2, ...)` ausgeführt
* Aufgrund einer Limitierung, seitens Ruby (maximal 15 Argumente pro Funktion), empfängt
die Funktion `Weechat.config_new_option` den Callback in einem Array von 6 Strings
(3 Callbacks + 3 Data Strings), somit sieht ein Aufruf der Funktion folgendermaßen aus:
// TRANSLATION MISSING
===== Initialization

You have to define _weechat_init_ and call _register_ inside.

// TRANSLATION MISSING
===== Functions

Functions are called with `Weechat.xxx(arg1, arg2, ...)`.

Due to a limitation of Ruby (at most 15 arguments per function), the function
`Weechat.config_new_option` receives the callbacks in an array of 6 strings
(3 callbacks + 3 data strings), so a call to this function looks like:

[source,ruby]
----
@@ -98,29 +172,46 @@ Weechat.config_new_option(config, section, "name", "string", "description of opt

==== Lua

* Funktionen werden im Format `weechat.xxx(arg1, arg2, ...)` ausgeführt
// TRANSLATION MISSING
===== Functions

Functions are called with `weechat.xxx(arg1, arg2, ...)`.

==== Tcl

* Funktionen werden im Format `weechat::xxx arg1 arg2 ...` ausgeführt
// TRANSLATION MISSING
===== Functions

Functions are called with `weechat::xxx arg1 arg2 ...`.

==== Guile (Scheme)

* Funktionen werden im Format `(weechat:xxx arg1 arg2 ...)` ausgeführt
* folgende Funktionen nutzen eine Liste von Argumente (anstelle von vielen
Argumenten für andere Funktionen), dies liegt daran das Guile die Anzahl
der Argumente eingeschränkt ist:
** config_new_section
** config_new_option
** bar_new
// TRANSLATION MISSING
===== Functions

Functions are called with `(weechat:xxx arg1 arg2 ...)`.

The following functions take a single list of arguments (instead of separate
arguments, as other functions do), because their number of arguments exceeds
the maximum allowed in Guile:

* config_new_section
* config_new_option
* bar_new

==== JavaScript

* Funktionen werden im Format `weechat.xxx(arg1, arg2, ...);` ausgeführt
// TRANSLATION MISSING
===== Functions

Functions are called with `weechat.xxx(arg1, arg2, ...);`.

==== PHP

* Funktionen werden im Format `weechat_xxx(arg1, arg2, ...);` ausgeführt
// TRANSLATION MISSING
===== Functions

Functions are called with `weechat_xxx(arg1, arg2, ...);`.

[[register_function]]
=== Die "Register" Funktion
@@ -1103,15 +1194,25 @@ weechat.prnt("", "Wert der Option weechat.color.chat_delimiters ist: %s"
[[irc_catch_messages]]
==== Nachrichten abfangen

Die IRC Erweiterung sendet zwei Signale wenn eine Nachricht empfangen wurde.
`xxx` ist der interne IRC Servername, `yyy` ist der IRC Befehl der empfangen
wurde (JOIN, QUIT, PRIVMSG, 301, ..):
// TRANSLATION MISSING
The IRC plugin sends four signals for each message received (`xxx` is the
internal IRC server name, `yyy` is the IRC command name, such as JOIN, QUIT,
PRIVMSG, 301, ...):

xxxx,irc_in_yyy::
Signal wird gesendet bevor die Nachricht verarbeitet wurde.
// TRANSLATION MISSING
xxx,irc_in_yyy::
signal sent before processing message, only if message is *not* ignored

// TRANSLATION MISSING
xxx,irc_in2_yyy::
Signal wird gesendet nachdem die Nachricht verarbeitet wurde.
signal sent after processing message, only if message is *not* ignored

// TRANSLATION MISSING
xxx,irc_raw_in_yyy::
signal sent before processing message, even if message is ignored

// TRANSLATION MISSING
xxx,irc_raw_in2_yyy::
signal sent after processing message, even if message is ignored
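
The naming scheme above can be summarized in a few lines. This is only an illustrative sketch: `irc_in_signal_names` is a hypothetical helper, `freenode` is a made-up server name, and the command is lowercased because the examples in this document write it that way. The order of the returned list carries no meaning.

```python
def irc_in_signal_names(server, command, ignored):
    """List the signal names described above for one received message.

    Hypothetical helper for illustration; it does not call WeeChat.
    """
    command = command.lower()
    # The "raw" signals are sent even if the message is ignored.
    names = [
        f"{server},irc_raw_in_{command}",
        f"{server},irc_raw_in2_{command}",
    ]
    # The other two are sent only if the message is not ignored.
    if not ignored:
        names.append(f"{server},irc_in_{command}")
        names.append(f"{server},irc_in2_{command}")
    return names


print(irc_in_signal_names("freenode", "JOIN", ignored=False))
print(irc_in_signal_names("freenode", "JOIN", ignored=True))
```

A script that must see ignored messages would hook one of the `irc_raw_in` names; otherwise the `irc_in` names are enough.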

[source,python]
----
@@ -1133,8 +1234,19 @@ weechat.hook_signal("*,irc_in2_join", "join_cb", "")
[[irc_modify_messages]]
==== Nachrichten ändern

Die IRC Erweiterung verschickt einen "modifier" mit Namen "irc_in_xxx" ("xxx" steht für den
Namen des IRC Befehls) falls eine Nachricht empfangen wurde die dann modifiziert werden kann.
// TRANSLATION MISSING
The IRC plugin sends two "modifiers" for each message received ("xxx" is the
IRC command), so that you can modify it:

// TRANSLATION MISSING
irc_in_xxx::
modifier sent before charset decoding: use with caution, the string may
contain invalid UTF-8 data; use only for raw operations on a message

// TRANSLATION MISSING
irc_in2_xxx::
modifier sent after charset decoding, so the string received is always
UTF-8 valid (*recommended*)

[source,python]
----
@@ -1143,7 +1255,7 @@ def modifier_cb(data, modifier, modifier_data, string):
# (Okay dies ist nicht wirklich sinnvoll, aber es ist auch nur ein Beispiel!)
return "%s %s" % (string, modifier_data)
weechat.hook_modifier("irc_in_privmsg", "modifier_cb", "")
weechat.hook_modifier("irc_in2_privmsg", "modifier_cb", "")
----

[WARNING]
13 changes: 10 additions & 3 deletions doc/en/weechat_plugin_api.en.adoc
@@ -9871,12 +9871,16 @@ List of signals sent by WeeChat and plugins:
| irc | xxx,irc_out_yyy ^(1)^ |
String: message. |
IRC message sent to server after automatic split
(to fit in 512 bytes by default).
(to fit in 512 bytes by default). +
*Warning:* the string may contain invalid UTF-8 data.
Signal "xxx,irc_out1_yyy" is recommended instead.

| irc | xxx,irc_outtags_yyy ^(1)^ +
_(WeeChat ≥ 0.3.4)_ |
String: tags + ";" + message. |
Tags + IRC message sent to server.
Tags + IRC message sent to server. +
*Warning:* the string may contain invalid UTF-8 data.
Signal "xxx,irc_out1_yyy" is recommended instead.

| irc | irc_ctcp |
String: message. |
@@ -11214,7 +11218,10 @@ List of modifiers used by WeeChat and plugins:

| [[hook_modifier_irc_in_xxx]] irc_in_xxx ^(1)^ |
Server name |
Content of message received from IRC server (before charset decoding). |
Content of message received from IRC server (before charset decoding). +
*Warning:* the string may contain invalid UTF-8 data; use only for raw
operations on a message.
Modifier <<hook_modifier_irc_in2_xxx,irc_in2_xxx>> is recommended instead. |
New content of message.

| [[hook_modifier_irc_in2_xxx]] irc_in2_xxx ^(1)^ +