-
Notifications
You must be signed in to change notification settings - Fork 560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
document/publicize THINKFIRST #12813
Comments
From @rjbsA number of things broken by the new COW work were answered with, "By the way, It's true enough, but if it's going to be needed, let's get this documented so -- |
From @doyOn Mon, Feb 25, 2013 at 08:06:11AM -0800, Ricardo SIGNES wrote:
Also, the core typemap for char* (not const char*) needs to be updated -doy |
The RT System itself - Status changed from 'new' to 'open' |
From @bulk88On Mon Feb 25 08:33:29 2013, doy@tozt.net wrote:
perlxs and perlxstypemap needs to be fixed. It uses "bool_t As a separate issue, "Memory Allocation" in perlguts should mention This ticket also needs to be linked to #116569. -- |
From @LeontOn Mon, Feb 25, 2013 at 5:06 PM, Ricardo SIGNES
Father C already documented SV_THINKFIRST in a11eaec, though I think Leon |
From @iabynOn Thu, Mar 21, 2013 at 01:15:00PM +0100, Leon Timmermans wrote:
Also, Its not just adding API documentation. The xs*.pod and/or perlguts -- |
From @rjbsI have moved this from 5.18 to 5.20 blockers. It sounds like this needs more than mere documentation, and we're not going to gain much by -- |
From @cpansproutOn Tue Mar 26 13:55:32 2013, rjbs wrote:
Also, if I’m not mistaken, sv_force_normal *is* already in the API, so -- Father Chrysostomos |
From @LeontOn Tue, Mar 26, 2013 at 9:55 PM, Ricardo SIGNES via RT
But what to do about the typemap? |
From @bulk88On Tue Mar 26 17:13:22 2013, sprout wrote:
And why unconditionally call a function that isn't needed? ______________________________________________________________________ -- |
From @bulk88On Tue Mar 26 20:53:12 2013, bulk88 wrote:
^ This is what I figured out is how to make an existing SV undef-enough to ______________________________________________________________________ Some dont care about the contents of a COW SV and want to release it. In a number of places SV_CHECK_THINKFIRST_COW_DROP is followed exactly -- |
From @iabynOn Tue, Mar 26, 2013 at 05:13:22PM -0700, Father Chrysostomos via RT wrote:
Also, since the thing that triggered this discussion originally was Since SvPV_force is already API, and is already documented (since 2000 or See also my recent comments in ticket RT #116407 about improving the docs -- |
From @tonycozOn Tue Mar 26 13:55:32 2013, rjbs wrote:
Beating this dead horse which is a blocker for 5.20: What might need to be done: - neither perlxs nor perlxstut discuss modifying (or even - L<perlguts/Working with SVs> should probably discuss how to - do we need a typemap entry? I can think of two cases where a) the caller provides a fixed length buffer, in which case b) the caller provides a variable length buffer For case b) in most cases I think you'd also need the So the only case that doesn't need the length (and hence The only real way I could see doing it would be to provide our typedef unsigned char WRITABLE_BYTE; and create typemaps for WRITABLE_BYTE etc: WRITEABLE_BYTE * T_WRITEABLE_BYTE T_WRITEABLE_UTF8 T_WRITEABLE_BYTE What doesn't need to be done: - SvPV() documents that you should use SvPV_force() if you want - SvPV_force() documents that the PV string is writable after - under Copy on Write, perlguts documents that you need Hopefully that makes some sense. Tony |
From @bulk88On Tue Feb 04 20:24:36 2014, tonyc wrote:
The problem is SvPV_force_nolen and SvPVbyte_force_nolen do something totally useless if you just want to write into them. If I am going to just write, why do I want to A. copy the COW contents to a new non-COW block, B. convert encodings, then write over it? I want a uninitialized PV. sv_force_normal() copies the COW data over. Some people want R/W buffers to manipulate in place the data too. -- |
From perl5-porters@perl.orgTony Cook wrote:
The Copy on Write section added to perlguts was meant to address this. |
From @tonycozOn Wed Feb 05 01:57:24 2014, bulk88 wrote:
I think: sv_force_normal_flags(sv, SV_COW_DROP_PV); is the answer there, though the name of the flag leaves something to desire, Tony |
From @tonycozOn Tue Feb 04 20:24:36 2014, tonyc wrote:
Attached a potential patch. Tony |
From @tonycoz0001-perl-116925-discuss-modifying-an-SV-s-buffer.patchFrom 25598ee4bac31536cab2363f4bc3c70c716f1d8e Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 26 Mar 2014 16:18:08 +1100
Subject: [perl #116925] discuss modifying an SV's buffer
in the same place we otherwise talk about SVs
---
pod/perlguts.pod | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
index 3fb5137..db8548a 100644
--- a/pod/perlguts.pod
+++ b/pod/perlguts.pod
@@ -159,6 +159,38 @@ decrease, the allocated memory of an SV and that it does not automatically
add space for the trailing NUL byte (perl's own string functions typically do
C<SvGROW(sv, len + 1)>).
+If you want to write to an existing SV's buffer and set its value to a
+string, use SvPV_force() or one of its variants to force the SV to be
+a PV. This will remove any of various types of non-stringness from
+the SV, including copy-on-write and being a reference. This can be
+used to implement something like read():
+
+ (void)SvPVbyte_force(sv, len);
+ s = SvGROW(sv, offset + needlen + 1);
+ /* something that modifies up to needlen bytes at s, but modifies newlen
+ eg. newlen = read(fd, s + offset. needlen);
+ ignoring errors for these examples
+ */
+ s[offset + newlen] = '\0';
+ SvCUR_set(sv, offset + newlen);
+ SvUTF8_off(sv);
+ SvSETMAGIC(sv);
+
+If you don't need the existing content of the SV, you can avoid some
+copying with:
+
+ if (SvTHINKFIRST(sv))
+ sv_force_normal_flags(sv, SV_COW_DROP_PV);
+ SvUPGRADE(sv, SVt_PV);
+ s = SvGROW(sv, needlen + 1);
+ /* something that modifies up to needlen bytes at s, but modifies newlen
+ eg. newlen = read(fd, s. needlen);
+ */
+ s[newlen] = '\0';
+ SvCUR_set(sv, newlen);
+ SvPOK_only(sv); /* also clears SVf_UTF8 */
+ SvSETMAGIC(sv);
+
If you have an SV and want to know what kind of data Perl thinks is stored
in it, you can use the following macros to check the type of SV you have.
--
1.7.10.4
|
From @rjbsI'd like to see this reviewed and (if approved) applied, but I've removed this ticket from blocking. -- |
From @bulk88On Mon Feb 25 10:38:49 2013, bulk88 wrote:
This pod issue is still a problem in blead. I was 1/4 done writing a new ticket for it before I found this old post of mine. -- |
From @bulk88On Tue Mar 25 22:19:43 2014, tonyc wrote:
+If you want to write to an existing SV's buffer and set its value to a WHy the "(void)SvPVbyte_force(sv, len);" line? Won't SvGROW drop the COW and RVs? +If you don't need the existing content of the SV, you can avoid some Maybe the API needs to be reworks. WOuldn't sv_force_normal_flags create a SVPV without a buffer already? So why the SvUPGRADE? I Kno the SvUPGRADE is protection so the SvGROW doesn't segv because its an SVIV with no body, only a head. Instread of SvUPGRADE(sv, SVt_PV); SvGROW(sv, needlen + 1); try this if(SvTYPE(sv) >= SVt_PV) { +If you want to write to an existing SV's buffer and set its value to a Is this for read buffer, then you can then write to it? Where the buffer will be read, and conditionally written to in the same C library call? If so it needs to explain better that this produces the Perl level contents for you to read and write to. And this does not supply uninit memory, but only initiated memory. Use "If you want to read and write to an string" . also mention " If you are ONLY WRITTING, see the next paragraph for how to do it faster." -- |
From @tonycozOn Wed Apr 16 02:47:11 2014, bulk88 wrote:
SvGROW() cleans those up, but not some other forms of specialness.
sv_force_normal_flags() only removes specialness, for unspecial SVs (like an undef SV) it does nothing. So replacing those 3 lines with your if statement would be incorrect. This code was based on Perl_sv_setpvn().
You're right, it needs to explain that the content is readable. As to only writing, I think the proximity with the other paragraph does that. Tony |
From @bulk88On Wed Apr 16 15:17:05 2014, tonyc wrote:
Ok, yor're right about RO flaged. Another issue, why "SvPVbyte_force" and not "SvPV_force"?
There was a misunderstanding or I wasn't clear. I'll rephrase it. Replace if (SvTHINKFIRST(sv)) with if (SvTHINKFIRST(sv)) 1 remaining issue, that I dont think POD/docs can fix is, something like
can perform redundant ascii/utf8 conversion and stringifys on a buffer that will be tossed. Here is an unnecessary stringify example with SvPV_force. void t1(SV* sv) C:\sources\>perl -MDevel::Peek=Dump -ML::XS -E"my $i; my $a = \$i; L::XS::t1($a); Dump($a); perl519.dll!Perl_sv_2pv_flags(interpreter * my_perl=0x00364e9c, sv * const sv=0x00000006, unsigned int * const lp=0x0012fc68, const long flags=0) Line 2885 C Newx(buffer, len, char); /* Working backwards */ retval -= typelen; This stringifying by SvPV_force matches its docs. /* Note that coercing an arbitrary scalar into a plain PV will potentially Then again this proposed wording \|/
is about having valid read data. I think having s = SvGROW(sv, offset + needlen + 1); after (void)SvPVbyte_force(sv, len); is strange and should be removed for the "we want read and write data" instructions. Wouldnt SvGROW have added uninit memory to the end of the readable valid data, and depending if the user/dev is bright enough realize what he did (mixing up expression "len" with "offset + needlen + 1"), tell some C library to parse/translate uninit memory at the end of the post-COW buffer? -- |
From @tonycozOn Wed, Apr 16, 2014 at 04:13:24PM -0700, bulk88 via RT wrote:
That, and the other UTF8 mentions are there to get the reader to think Using a simple SvPV_force() with my commented out example case would
I looked at your quote of my code instead of your prose, my fault.
It can't be COW at this point, sv_force_normal_flags() removes it.
I could see something like that as a new macro or function. I don't
If the caller supplies a reference or undef or something else that If they don't want any part of the the content, they can skip the The SvGROW() may end up copying that generated content, but I don't
That's a good point. I can see two cases: 1) offset <= SvCUR() - if the wrapped function does read needlen 2) offset > SvCUR() - the wrapped function is going to receive Tony |
From @tonycozOn Wed Apr 16 18:51:10 2014, tonyc wrote:
I went through some drafts trying to address this, but I think examples like So all I've added is a check that offset <= len so we don't return undefined content back to the caller. I don't think it's necessary to document here that the extra memory allocated by SvGROW() is undefined, since we don't say what it is initialized it, the caller shouldn't be making an assumptions about it. Tony |
From @tonycoz0001-perl-116925-discuss-modifying-an-SV-s-buffer.patchFrom ad5fd95262bcbe0faa8f380328c461fb88be4941 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Mon, 21 Apr 2014 11:54:08 +1000
Subject: [perl #116925] discuss modifying an SV's buffer
in the same place we otherwise talk about SVs
---
pod/perlguts.pod | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
index 3fb5137..165e44e 100644
--- a/pod/perlguts.pod
+++ b/pod/perlguts.pod
@@ -159,6 +159,42 @@ decrease, the allocated memory of an SV and that it does not automatically
add space for the trailing NUL byte (perl's own string functions typically do
C<SvGROW(sv, len + 1)>).
+If you want to write to an existing SV's buffer and set its value to a
+string, use SvPV_force() or one of its variants to force the SV to be
+a PV. This will remove any of various types of non-stringness from
+the SV while preserving the content of the SV in the PV. This can be
+used to implement something like read():
+
+ (void)SvPVbyte_force(sv, len);
+ if (offset > len)
+ croak("Supplied offset beyond the end of the string");
+ s = SvGROW(sv, offset + needlen + 1);
+ /* something that modifies up to needlen bytes at s+offset, but
+ modifies newlen bytes
+ eg. newlen = read(fd, s + offset. needlen);
+ ignoring errors for these examples
+ */
+ s[offset + newlen] = '\0';
+ SvCUR_set(sv, offset + newlen);
+ SvUTF8_off(sv);
+ SvSETMAGIC(sv);
+
+If you don't need the existing content of the SV, you can avoid some
+copying with:
+
+ if (SvTHINKFIRST(sv))
+ sv_force_normal_flags(sv, SV_COW_DROP_PV);
+ SvUPGRADE(sv, SVt_PV);
+ s = SvGROW(sv, needlen + 1);
+ /* something that modifies up to needlen bytes at s, but modifies
+ newlen bytes
+ eg. newlen = read(fd, s. needlen);
+ */
+ s[newlen] = '\0';
+ SvCUR_set(sv, newlen);
+ SvPOK_only(sv); /* also clears SVf_UTF8 */
+ SvSETMAGIC(sv);
+
If you have an SV and want to know what kind of data Perl thinks is stored
in it, you can use the following macros to check the type of SV you have.
--
1.7.10.4
|
From @iabynOn Sun, Apr 20, 2014 at 07:04:29PM -0700, Tony Cook via RT wrote:
I've been reading and re-reading this thread and the two related ones My gut reaction to your proposed doc patch is uneasiness. I don't like the My current understanding is that there are four types of string buffer * get-only This is already covered in perlguts: basically it's SvPV() and Perl-level equivalent: stringification: e.g. "$var". * get-and-set This is where you modify an existing value by directly modifying the Perl-level equivalent: e.g. substr($var, 1, 3, "abcd"); Perhaps a simple string prepend would make a clearer example? I think here we should should also point out that the various sv_cat*() * set-only This is where we don't care about the old value; we just want to set Perl-level equivalent: e.g. $var = "new string"; This is where it seems to me, that most of the difficultly lies. This is a special case of get-and-set, only we're trying to be more My feeling here is that most of the time people should just use len = read(fh, buf, 1024); sv_setpvn() will handle all the de-COWing etc, in an efficient manner. In those cases where you just want to allocate a particular sized sv_setpvn(sv,"",0); does all the sv_force_normal_flags(), SvUPGRADE() etc stuff for you at * replace buffer This is a particular use-case that Leon T mentioned, where you free(SvPVX(sv)); but safely. I could be wrong, but doesn't sv_usepvn_flags () and similar do this? -- |
From @LeontOn Tue, May 6, 2014 at 4:00 PM, Dave Mitchell <davem@iabyn.com> wrote:
I agree it's not optimal, but that may be inherent to the issue.
I would welcome new API functions that would make this easier.
sv_usepvn_flags assumed the buffer is malloc()ed, while in my use-case it's Leon |
From @bulk88On Tue May 06 07:01:16 2014, davem wrote:
That usually results in a redundant double copy, unless the * came from a foreign allocator. Direct writing into the PV is always better.
true
Yes, but only for Newx blocks and only if SV_HAS_TRAILING_NUL is used. Never pass a non-Newx (AKA malloc) ptr to sv_usepvn_flags. On Tue May 06 15:42:16 2014, LeonT wrote:
See paragraph above. -- |
From @tonycozOn Tue May 06 07:01:16 2014, davem wrote:
I'm not fond of using read() in the first example, though it's based on a simple implementation of perl's read() function. I'm using read() as the function since I believe (hope!) that most of the people digging this deep into guts to make their code efficient understand the behaviour of the POSIX read() function. Perhaps I should be doing a simple appended read() instead, since it avoid the extra parameter and some of the checks.
...
If you're doing a string prepend, you should probably using sv_insert(), which I should probably mention. I've simplified the get-and-set to an append, and added notes about sv_cat*() and sv_insert().
I've added a reference to sv_setpvn().
I've changed the example to use that.
I've added a brief discussion of that. Tony |
From @tonycoz0001-perl-116925-discuss-modifying-an-SV-s-buffer.patchFrom 0751e326fb39ab12055583509812e6047ce20e1e Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Tue, 27 May 2014 12:01:14 +1000
Subject: [perl #116925] discuss modifying an SV's buffer
in the same place we otherwise talk about SVs
---
pod/perlguts.pod | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
index 3fb5137..cf3df89 100644
--- a/pod/perlguts.pod
+++ b/pod/perlguts.pod
@@ -159,6 +159,58 @@ decrease, the allocated memory of an SV and that it does not automatically
add space for the trailing NUL byte (perl's own string functions typically do
C<SvGROW(sv, len + 1)>).
+If you want to write to an existing SV's buffer and set its value to a
+string, use SvPV_force() or one of its variants to force the SV to be
+a PV. This will remove any of various types of non-stringness from
+the SV while preserving the content of the SV in the PV. This can be
+used, for example, to append data from an API function to a buffer
+without extra copying:
+
+ (void)SvPVbyte_force(sv, len);
+ s = SvGROW(sv, len + needlen + 1);
+ /* something that modifies up to needlen bytes at s+len, but
+ modifies newlen bytes
+ eg. newlen = read(fd, s + len, needlen);
+ ignoring errors for these examples
+ */
+ s[len + newlen] = '\0';
+ SvCUR_set(sv, len + newlen);
+ SvUTF8_off(sv);
+ SvSETMAGIC(sv);
+
+If you already have the data in memory or if you want to keep your
+code simple, you can use one of the sv_cat*() variants, such as
+sv_catpvn(). If you want to insert anywhere in the string you can use
+sv_insert() or sv_insert_flags().
+
+If you don't need the existing content of the SV, you can avoid some
+copying with:
+
+ sv_setpvn(sv, "", 0);
+ s = SvGROW(sv, needlen + 1);
+ /* something that modifies up to needlen bytes at s, but modifies
+ newlen bytes
+ eg. newlen = read(fd, s. needlen);
+ */
+ s[newlen] = '\0';
+ SvCUR_set(sv, newlen);
+ SvPOK_only(sv); /* also clears SVf_UTF8 */
+ SvSETMAGIC(sv);
+
+Again, if you already have the data in memory or want to avoid the
+complexity of the above, you can use sv_setpvn().
+
+If you have a buffer allocated with Newx() and want to set that as the
+SV's value, you can use sv_usepvn_flags(). That has some requirements
+if you want to avoid perl re-allocating the buffer to fit the trailing
+NUL:
+
+ Newx(buf, somesize+1, char);
+ /* ... fill in buf ... */
+ buf[somesize] = '\0';
+ sv_usepvn_flags(sv, buf, somesize, SV_SMAGIC | SV_HAS_TRAILING_NUL);
+ /* buf now belongs to perl, don't release it */
+
If you have an SV and want to know what kind of data Perl thinks is stored
in it, you can use the following macros to check the type of SV you have.
--
1.7.10.4
|
@tonycoz - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#116925 (status was 'resolved')
Searchable as RT116925$
The text was updated successfully, but these errors were encountered: