Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing SvPV* utf8/byte nomg macro variants #16392

Closed
p5pRT opened this issue Jan 29, 2018 · 55 comments
Closed

Missing SvPV* utf8/byte nomg macro variants #16392

p5pRT opened this issue Jan 29, 2018 · 55 comments

Comments

@p5pRT
Copy link

@p5pRT p5pRT commented Jan 29, 2018

Migrated from rt.perl.org#132782 (status was 'pending release')

Searchable as RT132782$

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jan 29, 2018

From @pali

Hi! Currently in perl there are missing SvPVutf8_nomg and SvPVbyte_nomg
macros, equivalent of SvPVutf8 and SvPVbyte, but without processing get
magic. To write XS module correctly without being affected by Perl's The
Unicode Bug, it is easier to use SvPVutf8 resp. SvPVbyte macros instead
of combination of SvPV + SvUTF8 with manual converting Latin1 to utf8.
But if it is needed to distinguish between undef and string in function
implemented in XS, then it SvPVutf8 cannot be used as it throw warning
in case scalar is undef. I think that it is common requirement to
support API undef or string, therefore SvPVutf8_nomg would be really
useful.

Currently for API which accepts undef or string is required something
like this​:

void
function(arg)
  SV *arg
PREINIT​:
  SV *tmp;
  char *str;
  STRLEN len;
INIT​:
  SvGETMAGIC(arg);
CODE​:
  if (SvOK(arg)) {
  str = SvPV_nomg(arg, len);
  if (!SvUTF8(arg) {
  if (SvGMAGICAL(arg))
  tmp = sv_2mortal(newSVpvn(str, len));
  else
  tmp = arg;
  str = SvPVutf8(tmp, len);
  }
  } else {
  str = NULL;
  len = 0;
  }
... now str/len contains either NULL or utf8 representation of arg ...

Which is really non-intuitive and hard to write from scratch for novice
as there is fully missing such (very common) example in any perl
documentation.

With SvPVutf8_nomg it would reduce code just to​:

void
function(arg)
  SV *arg
PREINIT​:
  char *str;
  STRLEN len;
INIT​:
  SvGETMAGIC(arg);
CODE​:
  if (SvOK(arg)) {
  str = SvPVutf8_nomg(arg, len);
  } else {
  str = NULL;
  len = 0;
  }
... now str/len contains either NULL or utf8 representation of arg ...

Maybe some SvPV* macro which would return NULL without warning for
undefined value can be useful too to even more simplify that code.

Also, perlapi documentation should suggest to use SvPVutf8 (reps.
SvPVbyte) function instead of SvPV as without processing SvUTF8() check,
such code is affected by the Perl's Unicode Bug.

Also, to prevent processing get magic more times, it is needed to call
get magic only once in XS function, so ideally with SvGETMAGIC() and
then using only *_nomg functions/macros.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jan 31, 2018

From @khwilliamson

Patches welcome :)
--
Karl Williamson

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jan 31, 2018

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 6, 2018

From @pali

On Wednesday 31 January 2018 06​:48​:21 Karl Williamson via RT wrote​:

Patches welcome :)

What about something like this?

#ifndef SvPVutf8_nomg
PERL_STATIC_INLINE char * SvPVutf8_nomg(pTHX_ SV *sv, STRLEN *len)
{
  char *buf = SvPV_nomg(sv, *len);
  if (SvUTF8(sv))
  return buf;
  if (SvGMAGICAL(sv))
  sv = sv_2mortal(newSVpvn(buf, *len));
  /* There is sv_utf8_upgrade_nomg(), but it is broken prior to Perl version 5.13.10 */
  return SvPVutf8(sv, *len);
}
#define SvPVutf8_nomg(sv, len) SvPVutf8_nomg(aTHX_ (sv), &(len))
#endif

#ifndef SvPVbyte_nomg
PERL_STATIC_INLINE char * SvPVbyte_nomg(pTHX_ SV *sv, STRLEN *len)
{
  char *buf = SvPV_nomg(sv, *len);
  if (!SvUTF8(sv))
  return buf;
  if (SvGMAGICAL(sv))
  {
  sv = sv_2mortal(newSVpvn(buf, *len));
  SvUTF8_on(sv);
  }
  return SvPVbyte(sv, *len);
}
#define SvPVbyte_nomg(sv, len) SvPVbyte_nomg(aTHX_ (sv), &(len))
#endif

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 11, 2018

From @pali

In attachment are RFC patches for new functions/macros​:

sv_utf8_downgrade_flags()
sv_utf8_downgrade_nomg()

sv_2pvbyte_flags()
sv_2pvutf8_flags()

SvPVutf8_nomg()
SvPVbyte_nomg()

SvPV_or_null()
SvPV_or_null_nomg()

SvPVutf8_or_null()
SvPVutf8_or_null_nomg()

SvPVbyte_or_null()
SvPVbyte_or_null_nomg()

What about them?

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 11, 2018

From @pali

0001-Implement-sv_utf8_downgrade_nomg.patch
From a2b03965498f57448e06d54a1ef4e26a7302593c Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:40:47 +0100
Subject: [RFC PATCH 1/4] Implement sv_utf8_downgrade_nomg

---
 embed.h | 2 +-
 proto.h | 2 +-
 sv.c    | 9 ++++++---
 sv.h    | 2 ++
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/embed.h b/embed.h
index d1fe34ab66..7df5dfdd52 100644
--- a/embed.h
+++ b/embed.h
@@ -703,7 +703,7 @@
 #define sv_upgrade(a,b)		Perl_sv_upgrade(aTHX_ a,b)
 #define sv_usepvn_flags(a,b,c,d)	Perl_sv_usepvn_flags(aTHX_ a,b,c,d)
 #define sv_utf8_decode(a)	Perl_sv_utf8_decode(aTHX_ a)
-#define sv_utf8_downgrade(a,b)	Perl_sv_utf8_downgrade(aTHX_ a,b)
+#define sv_utf8_downgrade_flags(a,b,c)	Perl_sv_utf8_downgrade_flags(aTHX_ a,b,c)
 #define sv_utf8_encode(a)	Perl_sv_utf8_encode(aTHX_ a)
 #define sv_utf8_upgrade_flags_grow(a,b,c)	Perl_sv_utf8_upgrade_flags_grow(aTHX_ a,b,c)
 #define sv_uv(a)		Perl_sv_uv(aTHX_ a)
diff --git a/proto.h b/proto.h
index 485211540b..e6b52c925e 100644
--- a/proto.h
+++ b/proto.h
@@ -3466,7 +3466,7 @@ PERL_CALLCONV void	Perl_sv_usepvn_mg(pTHX_ SV *sv, char *ptr, STRLEN len);
 PERL_CALLCONV bool	Perl_sv_utf8_decode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_DECODE	\
 	assert(sv)
-PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok);
+PERL_CALLCONV bool	Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags);
 #define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE	\
 	assert(sv)
 PERL_CALLCONV void	Perl_sv_utf8_encode(pTHX_ SV *const sv);
diff --git a/sv.c b/sv.c
index fa5295d4ba..6937222aae 100644
--- a/sv.c
+++ b/sv.c
@@ -3598,7 +3598,7 @@ use the C<Encode> extension for that.
 */
 
 bool
-Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
 
@@ -3606,7 +3606,10 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
         if (SvCUR(sv)) {
 	    U8 *s;
 	    STRLEN len;
-	    int mg_flags = SV_GMAGIC;
+	    int mg_flags = 0;
+
+            if (flags & SV_GMAGIC)
+                mg_flags = SV_GMAGIC;
 
             if (SvIsCOW(sv)) {
                 S_sv_uncow(aTHX_ sv, 0);
@@ -3616,7 +3619,7 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
 		MAGIC * mg = mg_find(sv, PERL_MAGIC_regex_global);
 		if (mg && mg->mg_len > 0 && mg->mg_flags & MGf_BYTES) {
 			mg->mg_len = sv_pos_b2u_flags(sv, mg->mg_len,
-						SV_GMAGIC|SV_CONST_RETURN);
+						mg_flags|SV_CONST_RETURN);
 			mg_flags = 0; /* sv_pos_b2u does get magic */
 		}
 		if ((mg = mg_find(sv, PERL_MAGIC_utf8)))
diff --git a/sv.h b/sv.h
index 1c7224277f..05aa924f76 100644
--- a/sv.h
+++ b/sv.h
@@ -1919,6 +1919,8 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
 #define sv_pvn_force_nomg(sv, lp) sv_pvn_force_flags(sv, lp, 0)
 #define sv_utf8_upgrade_flags(sv, flags) sv_utf8_upgrade_flags_grow(sv, flags, 0)
 #define sv_utf8_upgrade_nomg(sv) sv_utf8_upgrade_flags(sv, 0)
+#define sv_utf8_downgrade(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, SV_GMAGIC)
+#define sv_utf8_downgrade_nomg(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, 0)
 #define sv_catpvn_nomg(dsv, sstr, slen) sv_catpvn_flags(dsv, sstr, slen, 0)
 #define sv_catpv_nomg(dsv, sstr) sv_catpv_flags(dsv, sstr, 0)
 #define sv_setsv(dsv, ssv) \
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 11, 2018

From @pali

0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
From 58f6ccda8f161c8ace9aabed4305ba9e3580fe3f Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:19 +0100
Subject: [RFC PATCH 2/4] Fix do_vecget and do_vecset to process GET magic only
 once

---
 doop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doop.c b/doop.c
index 7cc227faac..399a573d7a 100644
--- a/doop.c
+++ b/doop.c
@@ -758,7 +758,7 @@ Perl_do_vecget(pTHX_ SV *sv, STRLEN offset, int size)
 	Perl_croak(aTHX_ "Illegal number of bits in vec");
 
     if (SvUTF8(sv)) {
-	if (Perl_sv_utf8_downgrade(aTHX_ sv, TRUE)) {
+	if (Perl_sv_utf8_downgrade_flags(aTHX_ sv, TRUE, 0)) {
             /* PVX may have changed */
             s = (unsigned char *) SvPV_flags(sv, srclen, svpv_flags);
         }
@@ -934,10 +934,10 @@ Perl_do_vecset(pTHX_ SV *sv)
                                          SV_GMAGIC | SV_UNDEF_RETURNS_NULL);
     if (SvUTF8(targ)) {
 	/* This is handled by the SvPOK_only below...
-	if (!Perl_sv_utf8_downgrade(aTHX_ targ, TRUE))
+	if (!Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0))
 	    SvUTF8_off(targ);
 	 */
-	(void) Perl_sv_utf8_downgrade(aTHX_ targ, TRUE);
+	(void) Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0);
     }
 
     (void)SvPOK_only(targ);
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 11, 2018

From @pali

0003-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
From c5cbcce771edadf1fc261e4e0f2de8c2e6482df6 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:46 +0100
Subject: [RFC PATCH 3/4] Implement SvPVutf8_nomg and SvPVbyte_nomg

---
 embed.h |  4 ++--
 proto.h |  4 ++--
 sv.c    | 20 ++++++++++++--------
 sv.h    | 10 ++++++++++
 4 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/embed.h b/embed.h
index 7df5dfdd52..a6a3be73f1 100644
--- a/embed.h
+++ b/embed.h
@@ -588,8 +588,8 @@
 #define sv_2mortal(a)		Perl_sv_2mortal(aTHX_ a)
 #define sv_2nv_flags(a,b)	Perl_sv_2nv_flags(aTHX_ a,b)
 #define sv_2pv_flags(a,b,c)	Perl_sv_2pv_flags(aTHX_ a,b,c)
-#define sv_2pvbyte(a,b)		Perl_sv_2pvbyte(aTHX_ a,b)
-#define sv_2pvutf8(a,b)		Perl_sv_2pvutf8(aTHX_ a,b)
+#define sv_2pvbyte_flags(a,b,c)		Perl_sv_2pvbyte_flags(aTHX_ a,b,c)
+#define sv_2pvutf8_flags(a,b,c)		Perl_sv_2pvutf8_flags(aTHX_ a,b,c)
 #define sv_2uv_flags(a,b)	Perl_sv_2uv_flags(aTHX_ a,b)
 #define sv_backoff		Perl_sv_backoff
 #define sv_bless(a,b)		Perl_sv_bless(aTHX_ a,b)
diff --git a/proto.h b/proto.h
index e6b52c925e..9f426debf7 100644
--- a/proto.h
+++ b/proto.h
@@ -3011,7 +3011,7 @@ PERL_CALLCONV char*	Perl_sv_2pv_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
-PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp);
+PERL_CALLCONV char*	Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
 #define PERL_ARGS_ASSERT_SV_2PVBYTE	\
 	assert(sv)
 #ifndef NO_MATHOMS
@@ -3021,7 +3021,7 @@ PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
-PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp);
+PERL_CALLCONV char*	Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
 #define PERL_ARGS_ASSERT_SV_2PVUTF8	\
 	assert(sv)
 #ifndef NO_MATHOMS
diff --git a/sv.c b/sv.c
index 6937222aae..6f297e600e 100644
--- a/sv.c
+++ b/sv.c
@@ -3268,18 +3268,19 @@ Usually accessed via the C<SvPVbyte> macro.
 */
 
 char *
-Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_2PVBYTE;
 
-    SvGETMAGIC(sv);
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
      || isGV_with_GP(sv) || SvROK(sv)) {
 	SV *sv2 = sv_newmortal();
 	sv_copypv_nomg(sv2,sv);
 	sv = sv2;
     }
-    sv_utf8_downgrade(sv,0);
+    sv_utf8_downgrade_nomg(sv,0);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
 
@@ -3295,15 +3296,18 @@ Usually accessed via the C<SvPVutf8> macro.
 */
 
 char *
-Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_2PVUTF8;
 
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
-     || isGV_with_GP(sv) || SvROK(sv))
-	sv = sv_mortalcopy(sv);
-    else
-        SvGETMAGIC(sv);
+     || isGV_with_GP(sv) || SvROK(sv)) {
+        SV *sv2 = sv_newmortal();
+        sv_copypv_nomg(sv2,sv);
+        sv = sv2;
+    }
     sv_utf8_upgrade_nomg(sv);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
diff --git a/sv.h b/sv.h
index 05aa924f76..bf4ab9bb5c 100644
--- a/sv.h
+++ b/sv.h
@@ -1730,6 +1730,10 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1744,6 +1748,10 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_nomg(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
@@ -1935,7 +1943,9 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
 #define sv_copypv_nomg(dsv, ssv) sv_copypv_flags(dsv, ssv, 0)
 #define sv_2pv(sv, lp) sv_2pv_flags(sv, lp, SV_GMAGIC)
 #define sv_2pv_nolen(sv) sv_2pv(sv, 0)
+#define sv_2pvbyte(sv, lp) sv_2pvbyte_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvbyte_nolen(sv) sv_2pvbyte(sv, 0)
+#define sv_2pvutf8(sv, lp) sv_2pvutf8_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvutf8_nolen(sv) sv_2pvutf8(sv, 0)
 #define sv_2pv_nomg(sv, lp) sv_2pv_flags(sv, lp, 0)
 #define sv_pvn_force(sv, lp) sv_pvn_force_flags(sv, lp, SV_GMAGIC)
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 11, 2018

From @pali

0004-Implement-SvPV-_or_null.patch
From 8e9bc41cd5a3176442023d67cf90ead64e3f9016 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 15:10:04 +0100
Subject: [RFC PATCH 4/4] Implement SvPV*_or_null*

---
 sv.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/sv.h b/sv.h
index bf4ab9bb5c..8199f572d5 100644
--- a/sv.h
+++ b/sv.h
@@ -1670,6 +1670,7 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
 #define SvPV(sv, lp)         SvPV_flags(sv, lp, SV_GMAGIC)
 #define SvPV_const(sv, lp)   SvPV_flags_const(sv, lp, SV_GMAGIC)
 #define SvPV_mutable(sv, lp) SvPV_flags_mutable(sv, lp, SV_GMAGIC)
+#define SvPV_or_null(sv, lp) (SvGETMAGIC(sv), SvPV_or_null_nomg(sv, lp))
 
 #define SvPV_flags(sv, lp, flags) \
     (SvPOK_nog(sv) \
@@ -1723,6 +1724,7 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
 #define SvPV_nomg(sv, lp) SvPV_flags(sv, lp, 0)
 #define SvPV_nomg_const(sv, lp) SvPV_flags_const(sv, lp, 0)
 #define SvPV_nomg_const_nolen(sv) SvPV_flags_const_nolen(sv, 0)
+#define SvPV_or_null_nomg(sv, lp) (SvOK(sv) ? SvPV_nomg(sv, lp) : ((lp = 0), NULL))
 
 /* ----*/
 
@@ -1730,10 +1732,20 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_or_null(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_nomg(sv, lp) \
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
 
+#define SvPVutf8_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1748,10 +1760,20 @@ Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_or_null(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_nomg(sv, lp) \
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
 
+#define SvPVbyte_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Aug 9, 2018

From @pali

On Sunday 11 February 2018 12​:44​:31 pali@​cpan.org wrote​:

In attachment are RFC patches for new functions/macros​:

sv_utf8_downgrade_flags()
sv_utf8_downgrade_nomg()

sv_2pvbyte_flags()
sv_2pvutf8_flags()

SvPVutf8_nomg()
SvPVbyte_nomg()

SvPV_or_null()
SvPV_or_null_nomg()

SvPVutf8_or_null()
SvPVutf8_or_null_nomg()

SvPVbyte_or_null()
SvPVbyte_or_null_nomg()

What about them?

Hello, can somebody comment / review these patches?

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 13, 2018

From @pali

In attachment are rebased patches on current blead.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 13, 2018

From @pali

v2-0001-Implement-sv_utf8_downgrade_nomg.patch
From c91c1eab30be09f24654805f4472dd11f7b26787 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:40:47 +0100
Subject: [PATCH 1/4] Implement sv_utf8_downgrade_nomg

---
 embed.h | 2 +-
 proto.h | 2 +-
 sv.c    | 9 ++++++---
 sv.h    | 2 ++
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/embed.h b/embed.h
index 8e762bc248..c82799b75b 100644
--- a/embed.h
+++ b/embed.h
@@ -850,7 +850,7 @@
 #define sv_upgrade(a,b)		Perl_sv_upgrade(aTHX_ a,b)
 #define sv_usepvn_flags(a,b,c,d)	Perl_sv_usepvn_flags(aTHX_ a,b,c,d)
 #define sv_utf8_decode(a)	Perl_sv_utf8_decode(aTHX_ a)
-#define sv_utf8_downgrade(a,b)	Perl_sv_utf8_downgrade(aTHX_ a,b)
+#define sv_utf8_downgrade_flags(a,b,c)	Perl_sv_utf8_downgrade_flags(aTHX_ a,b,c)
 #define sv_utf8_encode(a)	Perl_sv_utf8_encode(aTHX_ a)
 #define sv_utf8_upgrade_flags_grow(a,b,c)	Perl_sv_utf8_upgrade_flags_grow(aTHX_ a,b,c)
 #ifndef NO_MATHOMS
diff --git a/proto.h b/proto.h
index 0e6c76a3bc..5bb7194698 100644
--- a/proto.h
+++ b/proto.h
@@ -3652,7 +3652,7 @@ PERL_CALLCONV void	Perl_sv_usepvn_mg(pTHX_ SV *sv, char *ptr, STRLEN len);
 PERL_CALLCONV bool	Perl_sv_utf8_decode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_DECODE	\
 	assert(sv)
-PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok);
+PERL_CALLCONV bool	Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags);
 #define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE	\
 	assert(sv)
 PERL_CALLCONV void	Perl_sv_utf8_encode(pTHX_ SV *const sv);
diff --git a/sv.c b/sv.c
index 77f63183f7..abb1ed4a10 100644
--- a/sv.c
+++ b/sv.c
@@ -3600,7 +3600,7 @@ use the C<Encode> extension for that.
 */
 
 bool
-Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
 
@@ -3608,7 +3608,10 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
         if (SvCUR(sv)) {
 	    U8 *s;
 	    STRLEN len;
-	    int mg_flags = SV_GMAGIC;
+	    int mg_flags = 0;
+
+            if (flags & SV_GMAGIC)
+                mg_flags = SV_GMAGIC;
 
             if (SvIsCOW(sv)) {
                 S_sv_uncow(aTHX_ sv, 0);
@@ -3618,7 +3621,7 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
 		MAGIC * mg = mg_find(sv, PERL_MAGIC_regex_global);
 		if (mg && mg->mg_len > 0 && mg->mg_flags & MGf_BYTES) {
 			mg->mg_len = sv_pos_b2u_flags(sv, mg->mg_len,
-						SV_GMAGIC|SV_CONST_RETURN);
+						mg_flags|SV_CONST_RETURN);
 			mg_flags = 0; /* sv_pos_b2u does get magic */
 		}
 		if ((mg = mg_find(sv, PERL_MAGIC_utf8)))
diff --git a/sv.h b/sv.h
index f3392b08ec..4bf948f6d7 100644
--- a/sv.h
+++ b/sv.h
@@ -1915,6 +1915,8 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_pvn_force_nomg(sv, lp) sv_pvn_force_flags(sv, lp, 0)
 #define sv_utf8_upgrade_flags(sv, flags) sv_utf8_upgrade_flags_grow(sv, flags, 0)
 #define sv_utf8_upgrade_nomg(sv) sv_utf8_upgrade_flags(sv, 0)
+#define sv_utf8_downgrade(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, SV_GMAGIC)
+#define sv_utf8_downgrade_nomg(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, 0)
 #define sv_catpvn_nomg(dsv, sstr, slen) sv_catpvn_flags(dsv, sstr, slen, 0)
 #define sv_catpv_nomg(dsv, sstr) sv_catpv_flags(dsv, sstr, 0)
 #define sv_setsv(dsv, ssv) \
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 13, 2018

From @pali

v2-0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
From c7ee9830d6a45c7ecaa8a38d97f9da287d7ad2f1 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:19 +0100
Subject: [PATCH 2/4] Fix do_vecget and do_vecset to process GET magic only
 once

---
 doop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doop.c b/doop.c
index 54e35f10a6..febff6e217 100644
--- a/doop.c
+++ b/doop.c
@@ -758,7 +758,7 @@ Perl_do_vecget(pTHX_ SV *sv, STRLEN offset, int size)
 	Perl_croak(aTHX_ "Illegal number of bits in vec");
 
     if (SvUTF8(sv)) {
-	if (Perl_sv_utf8_downgrade(aTHX_ sv, TRUE)) {
+	if (Perl_sv_utf8_downgrade_flags(aTHX_ sv, TRUE, 0)) {
             /* PVX may have changed */
             s = (unsigned char *) SvPV_flags(sv, srclen, svpv_flags);
         }
@@ -934,10 +934,10 @@ Perl_do_vecset(pTHX_ SV *sv)
                                          SV_GMAGIC | SV_UNDEF_RETURNS_NULL);
     if (SvUTF8(targ)) {
 	/* This is handled by the SvPOK_only below...
-	if (!Perl_sv_utf8_downgrade(aTHX_ targ, TRUE))
+	if (!Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0))
 	    SvUTF8_off(targ);
 	 */
-	(void) Perl_sv_utf8_downgrade(aTHX_ targ, TRUE);
+	(void) Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0);
     }
 
     (void)SvPOK_only(targ);
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 13, 2018

From @pali

v2-0003-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
From 6a70a3ac23f50777aae98f0b0c938fcfa45f33a8 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:46 +0100
Subject: [PATCH 3/4] Implement SvPVutf8_nomg and SvPVbyte_nomg

---
 embed.h |  4 ++--
 proto.h |  4 ++--
 sv.c    | 20 ++++++++++++--------
 sv.h    | 10 ++++++++++
 4 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/embed.h b/embed.h
index c82799b75b..1fb183561e 100644
--- a/embed.h
+++ b/embed.h
@@ -723,8 +723,8 @@
 #define sv_2mortal(a)		Perl_sv_2mortal(aTHX_ a)
 #define sv_2nv_flags(a,b)	Perl_sv_2nv_flags(aTHX_ a,b)
 #define sv_2pv_flags(a,b,c)	Perl_sv_2pv_flags(aTHX_ a,b,c)
-#define sv_2pvbyte(a,b)		Perl_sv_2pvbyte(aTHX_ a,b)
-#define sv_2pvutf8(a,b)		Perl_sv_2pvutf8(aTHX_ a,b)
+#define sv_2pvbyte_flags(a,b,c)		Perl_sv_2pvbyte_flags(aTHX_ a,b,c)
+#define sv_2pvutf8_flags(a,b,c)		Perl_sv_2pvutf8_flags(aTHX_ a,b,c)
 #define sv_2uv_flags(a,b)	Perl_sv_2uv_flags(aTHX_ a,b)
 #define sv_backoff		Perl_sv_backoff
 #define sv_bless(a,b)		Perl_sv_bless(aTHX_ a,b)
diff --git a/proto.h b/proto.h
index 5bb7194698..eeb496ba5f 100644
--- a/proto.h
+++ b/proto.h
@@ -3183,7 +3183,7 @@ PERL_CALLCONV char*	Perl_sv_2pv_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
-PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp);
+PERL_CALLCONV char*	Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
 #define PERL_ARGS_ASSERT_SV_2PVBYTE	\
 	assert(sv)
 #ifndef NO_MATHOMS
@@ -3193,7 +3193,7 @@ PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
-PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp);
+PERL_CALLCONV char*	Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
 #define PERL_ARGS_ASSERT_SV_2PVUTF8	\
 	assert(sv)
 #ifndef NO_MATHOMS
diff --git a/sv.c b/sv.c
index abb1ed4a10..c81f23c143 100644
--- a/sv.c
+++ b/sv.c
@@ -3270,18 +3270,19 @@ Usually accessed via the C<SvPVbyte> macro.
 */
 
 char *
-Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_2PVBYTE;
 
-    SvGETMAGIC(sv);
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
      || isGV_with_GP(sv) || SvROK(sv)) {
 	SV *sv2 = sv_newmortal();
 	sv_copypv_nomg(sv2,sv);
 	sv = sv2;
     }
-    sv_utf8_downgrade(sv,0);
+    sv_utf8_downgrade_nomg(sv,0);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
 
@@ -3297,15 +3298,18 @@ Usually accessed via the C<SvPVutf8> macro.
 */
 
 char *
-Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_2PVUTF8;
 
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
-     || isGV_with_GP(sv) || SvROK(sv))
-	sv = sv_mortalcopy(sv);
-    else
-        SvGETMAGIC(sv);
+     || isGV_with_GP(sv) || SvROK(sv)) {
+        SV *sv2 = sv_newmortal();
+        sv_copypv_nomg(sv2,sv);
+        sv = sv2;
+    }
     sv_utf8_upgrade_nomg(sv);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
diff --git a/sv.h b/sv.h
index 4bf948f6d7..8f60c3f538 100644
--- a/sv.h
+++ b/sv.h
@@ -1726,6 +1726,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1740,6 +1744,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_nomg(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
@@ -1931,7 +1939,9 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_copypv_nomg(dsv, ssv) sv_copypv_flags(dsv, ssv, 0)
 #define sv_2pv(sv, lp) sv_2pv_flags(sv, lp, SV_GMAGIC)
 #define sv_2pv_nolen(sv) sv_2pv(sv, 0)
+#define sv_2pvbyte(sv, lp) sv_2pvbyte_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvbyte_nolen(sv) sv_2pvbyte(sv, 0)
+#define sv_2pvutf8(sv, lp) sv_2pvutf8_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvutf8_nolen(sv) sv_2pvutf8(sv, 0)
 #define sv_2pv_nomg(sv, lp) sv_2pv_flags(sv, lp, 0)
 #define sv_pvn_force(sv, lp) sv_pvn_force_flags(sv, lp, SV_GMAGIC)
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 13, 2018

From @pali

v2-0004-Implement-SvPV-_or_null.patch
From 807b85a3d7681382a37c85987c52cdb3a48d55c6 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 15:10:04 +0100
Subject: [PATCH 4/4] Implement SvPV*_or_null*

---
 sv.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/sv.h b/sv.h
index 8f60c3f538..c5fd59b217 100644
--- a/sv.h
+++ b/sv.h
@@ -1726,10 +1726,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_or_null(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_nomg(sv, lp) \
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
 
+#define SvPVutf8_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1744,10 +1754,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_or_null(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_nomg(sv, lp) \
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
 
+#define SvPVbyte_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 14, 2018

From @tonycoz

On Thu, 13 Sep 2018 01​:08​:29 -0700, pali@​cpan.org wrote​:

In attachment are rebased patches on current blead.

embed.h is a generated file, modify embed.fnc and then regenerate embed.fnc with make regen.

Tony

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 25, 2018

From @pali

On Thursday 13 September 2018 18​:18​:57 Tony Cook via RT wrote​:

embed.h is a generated file, modify embed.fnc and then regenerate embed.fnc with make regen.

Fixed. New patches are attached.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 25, 2018

From @pali

v3-0001-Implement-sv_utf8_downgrade_nomg.patch
From 19919d1148789e45def7c7d0fa14a43fee6e2933 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:40:47 +0100
Subject: [PATCH 1/4] Implement sv_utf8_downgrade_nomg

---
 embed.fnc | 3 ++-
 embed.h   | 2 +-
 proto.h   | 5 +++++
 sv.c      | 9 ++++++---
 sv.h      | 2 ++
 5 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 7bffdc67ea..5f47efa437 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2051,7 +2051,8 @@ ApmdbR	|char*	|sv_pvutf8	|NN SV *sv
 ApmdbR	|char*	|sv_pvbyte	|NN SV *sv
 Apmdb	|STRLEN	|sv_utf8_upgrade|NN SV *sv
 Amd	|STRLEN	|sv_utf8_upgrade_nomg|NN SV *sv
-ApdM	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+ApdMmb	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+ApdM	|bool	|sv_utf8_downgrade_flags|NN SV *const sv|const bool fail_ok|const U32 flags
 Apd	|void	|sv_utf8_encode |NN SV *const sv
 ApdM	|bool	|sv_utf8_decode |NN SV *const sv
 Apdmb	|void	|sv_force_normal|NN SV *sv
diff --git a/embed.h b/embed.h
index 8e762bc248..c82799b75b 100644
--- a/embed.h
+++ b/embed.h
@@ -850,7 +850,7 @@
 #define sv_upgrade(a,b)		Perl_sv_upgrade(aTHX_ a,b)
 #define sv_usepvn_flags(a,b,c,d)	Perl_sv_usepvn_flags(aTHX_ a,b,c,d)
 #define sv_utf8_decode(a)	Perl_sv_utf8_decode(aTHX_ a)
-#define sv_utf8_downgrade(a,b)	Perl_sv_utf8_downgrade(aTHX_ a,b)
+#define sv_utf8_downgrade_flags(a,b,c)	Perl_sv_utf8_downgrade_flags(aTHX_ a,b,c)
 #define sv_utf8_encode(a)	Perl_sv_utf8_encode(aTHX_ a)
 #define sv_utf8_upgrade_flags_grow(a,b,c)	Perl_sv_utf8_upgrade_flags_grow(aTHX_ a,b,c)
 #ifndef NO_MATHOMS
diff --git a/proto.h b/proto.h
index 0e6c76a3bc..e0e22edff2 100644
--- a/proto.h
+++ b/proto.h
@@ -3652,9 +3652,14 @@ PERL_CALLCONV void	Perl_sv_usepvn_mg(pTHX_ SV *sv, char *ptr, STRLEN len);
 PERL_CALLCONV bool	Perl_sv_utf8_decode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_DECODE	\
 	assert(sv)
+#ifndef NO_MATHOMS
 PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok);
 #define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE	\
 	assert(sv)
+#endif
+PERL_CALLCONV bool	Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS	\
+	assert(sv)
 PERL_CALLCONV void	Perl_sv_utf8_encode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_ENCODE	\
 	assert(sv)
diff --git a/sv.c b/sv.c
index 77f63183f7..abb1ed4a10 100644
--- a/sv.c
+++ b/sv.c
@@ -3600,7 +3600,7 @@ use the C<Encode> extension for that.
 */
 
 bool
-Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
 
@@ -3608,7 +3608,10 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
         if (SvCUR(sv)) {
 	    U8 *s;
 	    STRLEN len;
-	    int mg_flags = SV_GMAGIC;
+	    int mg_flags = 0;
+
+            if (flags & SV_GMAGIC)
+                mg_flags = SV_GMAGIC;
 
             if (SvIsCOW(sv)) {
                 S_sv_uncow(aTHX_ sv, 0);
@@ -3618,7 +3621,7 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
 		MAGIC * mg = mg_find(sv, PERL_MAGIC_regex_global);
 		if (mg && mg->mg_len > 0 && mg->mg_flags & MGf_BYTES) {
 			mg->mg_len = sv_pos_b2u_flags(sv, mg->mg_len,
-						SV_GMAGIC|SV_CONST_RETURN);
+						mg_flags|SV_CONST_RETURN);
 			mg_flags = 0; /* sv_pos_b2u does get magic */
 		}
 		if ((mg = mg_find(sv, PERL_MAGIC_utf8)))
diff --git a/sv.h b/sv.h
index f3392b08ec..4bf948f6d7 100644
--- a/sv.h
+++ b/sv.h
@@ -1915,6 +1915,8 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_pvn_force_nomg(sv, lp) sv_pvn_force_flags(sv, lp, 0)
 #define sv_utf8_upgrade_flags(sv, flags) sv_utf8_upgrade_flags_grow(sv, flags, 0)
 #define sv_utf8_upgrade_nomg(sv) sv_utf8_upgrade_flags(sv, 0)
+#define sv_utf8_downgrade(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, SV_GMAGIC)
+#define sv_utf8_downgrade_nomg(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, 0)
 #define sv_catpvn_nomg(dsv, sstr, slen) sv_catpvn_flags(dsv, sstr, slen, 0)
 #define sv_catpv_nomg(dsv, sstr) sv_catpv_flags(dsv, sstr, 0)
 #define sv_setsv(dsv, ssv) \
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 25, 2018

From @pali

v3-0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
From 85ef5d3ab3d1e64712c8bc384bfdc81dd5205658 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:19 +0100
Subject: [PATCH 2/4] Fix do_vecget and do_vecset to process GET magic only
 once

---
 doop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doop.c b/doop.c
index 54e35f10a6..febff6e217 100644
--- a/doop.c
+++ b/doop.c
@@ -758,7 +758,7 @@ Perl_do_vecget(pTHX_ SV *sv, STRLEN offset, int size)
 	Perl_croak(aTHX_ "Illegal number of bits in vec");
 
     if (SvUTF8(sv)) {
-	if (Perl_sv_utf8_downgrade(aTHX_ sv, TRUE)) {
+	if (Perl_sv_utf8_downgrade_flags(aTHX_ sv, TRUE, 0)) {
             /* PVX may have changed */
             s = (unsigned char *) SvPV_flags(sv, srclen, svpv_flags);
         }
@@ -934,10 +934,10 @@ Perl_do_vecset(pTHX_ SV *sv)
                                          SV_GMAGIC | SV_UNDEF_RETURNS_NULL);
     if (SvUTF8(targ)) {
 	/* This is handled by the SvPOK_only below...
-	if (!Perl_sv_utf8_downgrade(aTHX_ targ, TRUE))
+	if (!Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0))
 	    SvUTF8_off(targ);
 	 */
-	(void) Perl_sv_utf8_downgrade(aTHX_ targ, TRUE);
+	(void) Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0);
     }
 
     (void)SvPOK_only(targ);
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 25, 2018

From @pali

v3-0003-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
From 08bc280375e898e1d909e9f30c70f45b215c10c2 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:46 +0100
Subject: [PATCH 3/4] Implement SvPVutf8_nomg and SvPVbyte_nomg

---
 embed.fnc |  6 ++++--
 embed.h   |  4 ++--
 proto.h   | 10 ++++++++++
 sv.c      | 20 ++++++++++++--------
 sv.h      | 10 ++++++++++
 5 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 5f47efa437..e35717d78d 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1539,8 +1539,10 @@ Apd	|NV	|sv_2nv_flags	|NN SV *const sv|const I32 flags
 pMd	|SV*	|sv_2num	|NN SV *const sv
 Apmb	|char*	|sv_2pv		|NN SV *sv|NULLOK STRLEN *lp
 Apd	|char*	|sv_2pv_flags	|NN SV *const sv|NULLOK STRLEN *const lp|const I32 flags
-Apd	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
-Apd	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Apdmb	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
+Apd	|char*	|sv_2pvutf8_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
+Apdmb	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Apd	|char*	|sv_2pvbyte_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
 Abp	|char*	|sv_pvn_nomg	|NN SV* sv|NULLOK STRLEN* lp
 Apmb	|UV	|sv_2uv		|NN SV *sv
 Apd	|UV	|sv_2uv_flags	|NN SV *const sv|const I32 flags
diff --git a/embed.h b/embed.h
index c82799b75b..4844f6dbbb 100644
--- a/embed.h
+++ b/embed.h
@@ -723,8 +723,8 @@
 #define sv_2mortal(a)		Perl_sv_2mortal(aTHX_ a)
 #define sv_2nv_flags(a,b)	Perl_sv_2nv_flags(aTHX_ a,b)
 #define sv_2pv_flags(a,b,c)	Perl_sv_2pv_flags(aTHX_ a,b,c)
-#define sv_2pvbyte(a,b)		Perl_sv_2pvbyte(aTHX_ a,b)
-#define sv_2pvutf8(a,b)		Perl_sv_2pvutf8(aTHX_ a,b)
+#define sv_2pvbyte_flags(a,b,c)	Perl_sv_2pvbyte_flags(aTHX_ a,b,c)
+#define sv_2pvutf8_flags(a,b,c)	Perl_sv_2pvutf8_flags(aTHX_ a,b,c)
 #define sv_2uv_flags(a,b)	Perl_sv_2uv_flags(aTHX_ a,b)
 #define sv_backoff		Perl_sv_backoff
 #define sv_bless(a,b)		Perl_sv_bless(aTHX_ a,b)
diff --git a/proto.h b/proto.h
index e0e22edff2..49a952897f 100644
--- a/proto.h
+++ b/proto.h
@@ -3183,9 +3183,14 @@ PERL_CALLCONV char*	Perl_sv_2pv_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVBYTE	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
@@ -3193,9 +3198,14 @@ PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVUTF8	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
diff --git a/sv.c b/sv.c
index abb1ed4a10..c81f23c143 100644
--- a/sv.c
+++ b/sv.c
@@ -3270,18 +3270,19 @@ Usually accessed via the C<SvPVbyte> macro.
 */
 
 char *
-Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_2PVBYTE;
 
-    SvGETMAGIC(sv);
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
      || isGV_with_GP(sv) || SvROK(sv)) {
 	SV *sv2 = sv_newmortal();
 	sv_copypv_nomg(sv2,sv);
 	sv = sv2;
     }
-    sv_utf8_downgrade(sv,0);
+    sv_utf8_downgrade_nomg(sv,0);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
 
@@ -3297,15 +3298,18 @@ Usually accessed via the C<SvPVutf8> macro.
 */
 
 char *
-Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_2PVUTF8;
 
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
-     || isGV_with_GP(sv) || SvROK(sv))
-	sv = sv_mortalcopy(sv);
-    else
-        SvGETMAGIC(sv);
+     || isGV_with_GP(sv) || SvROK(sv)) {
+        SV *sv2 = sv_newmortal();
+        sv_copypv_nomg(sv2,sv);
+        sv = sv2;
+    }
     sv_utf8_upgrade_nomg(sv);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
diff --git a/sv.h b/sv.h
index 4bf948f6d7..8f60c3f538 100644
--- a/sv.h
+++ b/sv.h
@@ -1726,6 +1726,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1740,6 +1744,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_nomg(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
@@ -1931,7 +1939,9 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_copypv_nomg(dsv, ssv) sv_copypv_flags(dsv, ssv, 0)
 #define sv_2pv(sv, lp) sv_2pv_flags(sv, lp, SV_GMAGIC)
 #define sv_2pv_nolen(sv) sv_2pv(sv, 0)
+#define sv_2pvbyte(sv, lp) sv_2pvbyte_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvbyte_nolen(sv) sv_2pvbyte(sv, 0)
+#define sv_2pvutf8(sv, lp) sv_2pvutf8_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvutf8_nolen(sv) sv_2pvutf8(sv, 0)
 #define sv_2pv_nomg(sv, lp) sv_2pv_flags(sv, lp, 0)
 #define sv_pvn_force(sv, lp) sv_pvn_force_flags(sv, lp, SV_GMAGIC)
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 25, 2018

From @pali

v3-0004-Implement-SvPV-_or_null.patch
From 3799aa59f8e1c48411739d236201c87c937d0649 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 15:10:04 +0100
Subject: [PATCH 4/4] Implement SvPV*_or_null*

---
 sv.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/sv.h b/sv.h
index 8f60c3f538..c5fd59b217 100644
--- a/sv.h
+++ b/sv.h
@@ -1726,10 +1726,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_or_null(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_nomg(sv, lp) \
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
 
+#define SvPVutf8_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1744,10 +1754,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_or_null(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_nomg(sv, lp) \
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
 
+#define SvPVbyte_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2018

From @pali

In attachment is a new version of patches.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2018

From @pali

v4-0001-Implement-sv_utf8_downgrade_nomg.patch
From 4cd6d91dfb0c63abbad832dd032880151178f107 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:40:47 +0100
Subject: [PATCH 1/4] Implement sv_utf8_downgrade_nomg

---
 embed.fnc |  3 ++-
 embed.h   |  2 +-
 proto.h   |  5 +++--
 sv.c      | 11 +++++++----
 sv.h      |  2 ++
 5 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 7bffdc6..ed1ab8a 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2051,7 +2051,8 @@ ApmdbR	|char*	|sv_pvutf8	|NN SV *sv
 ApmdbR	|char*	|sv_pvbyte	|NN SV *sv
 Apmdb	|STRLEN	|sv_utf8_upgrade|NN SV *sv
 Amd	|STRLEN	|sv_utf8_upgrade_nomg|NN SV *sv
-ApdM	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+ApdMm	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+ApdM	|bool	|sv_utf8_downgrade_flags|NN SV *const sv|const bool fail_ok|const U32 flags
 Apd	|void	|sv_utf8_encode |NN SV *const sv
 ApdM	|bool	|sv_utf8_decode |NN SV *const sv
 Apdmb	|void	|sv_force_normal|NN SV *sv
diff --git a/embed.h b/embed.h
index 8e762bc..c82799b 100644
--- a/embed.h
+++ b/embed.h
@@ -850,7 +850,7 @@
 #define sv_upgrade(a,b)		Perl_sv_upgrade(aTHX_ a,b)
 #define sv_usepvn_flags(a,b,c,d)	Perl_sv_usepvn_flags(aTHX_ a,b,c,d)
 #define sv_utf8_decode(a)	Perl_sv_utf8_decode(aTHX_ a)
-#define sv_utf8_downgrade(a,b)	Perl_sv_utf8_downgrade(aTHX_ a,b)
+#define sv_utf8_downgrade_flags(a,b,c)	Perl_sv_utf8_downgrade_flags(aTHX_ a,b,c)
 #define sv_utf8_encode(a)	Perl_sv_utf8_encode(aTHX_ a)
 #define sv_utf8_upgrade_flags_grow(a,b,c)	Perl_sv_utf8_upgrade_flags_grow(aTHX_ a,b,c)
 #ifndef NO_MATHOMS
diff --git a/proto.h b/proto.h
index 0e6c76a..b2a68f5 100644
--- a/proto.h
+++ b/proto.h
@@ -3652,8 +3652,9 @@ PERL_CALLCONV void	Perl_sv_usepvn_mg(pTHX_ SV *sv, char *ptr, STRLEN len);
 PERL_CALLCONV bool	Perl_sv_utf8_decode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_DECODE	\
 	assert(sv)
-PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok);
-#define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE	\
+/* PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok); */
+PERL_CALLCONV bool	Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS	\
 	assert(sv)
 PERL_CALLCONV void	Perl_sv_utf8_encode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_ENCODE	\
diff --git a/sv.c b/sv.c
index 77f6318..e8c372f 100644
--- a/sv.c
+++ b/sv.c
@@ -3600,15 +3600,18 @@ use the C<Encode> extension for that.
 */
 
 bool
-Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
+    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS;
 
     if (SvPOKp(sv) && SvUTF8(sv)) {
         if (SvCUR(sv)) {
 	    U8 *s;
 	    STRLEN len;
-	    int mg_flags = SV_GMAGIC;
+	    int mg_flags = 0;
+
+            if (flags & SV_GMAGIC)
+                mg_flags = SV_GMAGIC;
 
             if (SvIsCOW(sv)) {
                 S_sv_uncow(aTHX_ sv, 0);
@@ -3618,7 +3621,7 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
 		MAGIC * mg = mg_find(sv, PERL_MAGIC_regex_global);
 		if (mg && mg->mg_len > 0 && mg->mg_flags & MGf_BYTES) {
 			mg->mg_len = sv_pos_b2u_flags(sv, mg->mg_len,
-						SV_GMAGIC|SV_CONST_RETURN);
+						mg_flags|SV_CONST_RETURN);
 			mg_flags = 0; /* sv_pos_b2u does get magic */
 		}
 		if ((mg = mg_find(sv, PERL_MAGIC_utf8)))
diff --git a/sv.h b/sv.h
index f3392b0..4bf948f 100644
--- a/sv.h
+++ b/sv.h
@@ -1915,6 +1915,8 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_pvn_force_nomg(sv, lp) sv_pvn_force_flags(sv, lp, 0)
 #define sv_utf8_upgrade_flags(sv, flags) sv_utf8_upgrade_flags_grow(sv, flags, 0)
 #define sv_utf8_upgrade_nomg(sv) sv_utf8_upgrade_flags(sv, 0)
+#define sv_utf8_downgrade(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, SV_GMAGIC)
+#define sv_utf8_downgrade_nomg(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, 0)
 #define sv_catpvn_nomg(dsv, sstr, slen) sv_catpvn_flags(dsv, sstr, slen, 0)
 #define sv_catpv_nomg(dsv, sstr) sv_catpv_flags(dsv, sstr, 0)
 #define sv_setsv(dsv, ssv) \
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2018

From @pali

v4-0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
From 4abad462f11d69b222a6ab858d25163797ea67e1 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:19 +0100
Subject: [PATCH 2/4] Fix do_vecget and do_vecset to process GET magic only
 once

---
 doop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doop.c b/doop.c
index 54e35f1..febff6e 100644
--- a/doop.c
+++ b/doop.c
@@ -758,7 +758,7 @@ Perl_do_vecget(pTHX_ SV *sv, STRLEN offset, int size)
 	Perl_croak(aTHX_ "Illegal number of bits in vec");
 
     if (SvUTF8(sv)) {
-	if (Perl_sv_utf8_downgrade(aTHX_ sv, TRUE)) {
+	if (Perl_sv_utf8_downgrade_flags(aTHX_ sv, TRUE, 0)) {
             /* PVX may have changed */
             s = (unsigned char *) SvPV_flags(sv, srclen, svpv_flags);
         }
@@ -934,10 +934,10 @@ Perl_do_vecset(pTHX_ SV *sv)
                                          SV_GMAGIC | SV_UNDEF_RETURNS_NULL);
     if (SvUTF8(targ)) {
 	/* This is handled by the SvPOK_only below...
-	if (!Perl_sv_utf8_downgrade(aTHX_ targ, TRUE))
+	if (!Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0))
 	    SvUTF8_off(targ);
 	 */
-	(void) Perl_sv_utf8_downgrade(aTHX_ targ, TRUE);
+	(void) Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0);
     }
 
     (void)SvPOK_only(targ);
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2018

From @pali

v4-0003-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
From e1589f03171ded795a69cf97b12c005db9131d11 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:46 +0100
Subject: [PATCH 3/4] Implement SvPVutf8_nomg and SvPVbyte_nomg

---
 embed.fnc |  6 ++++--
 embed.h   |  4 ++--
 proto.h   | 10 ++++++----
 sv.c      | 24 ++++++++++++++----------
 sv.h      | 10 ++++++++++
 5 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index ed1ab8a..d37687f 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1539,8 +1539,10 @@ Apd	|NV	|sv_2nv_flags	|NN SV *const sv|const I32 flags
 pMd	|SV*	|sv_2num	|NN SV *const sv
 Apmb	|char*	|sv_2pv		|NN SV *sv|NULLOK STRLEN *lp
 Apd	|char*	|sv_2pv_flags	|NN SV *const sv|NULLOK STRLEN *const lp|const I32 flags
-Apd	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
-Apd	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Apdm	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
+Apd	|char*	|sv_2pvutf8_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
+Apdm	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Apd	|char*	|sv_2pvbyte_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
 Abp	|char*	|sv_pvn_nomg	|NN SV* sv|NULLOK STRLEN* lp
 Apmb	|UV	|sv_2uv		|NN SV *sv
 Apd	|UV	|sv_2uv_flags	|NN SV *const sv|const I32 flags
diff --git a/embed.h b/embed.h
index c82799b..4844f6d 100644
--- a/embed.h
+++ b/embed.h
@@ -723,8 +723,8 @@
 #define sv_2mortal(a)		Perl_sv_2mortal(aTHX_ a)
 #define sv_2nv_flags(a,b)	Perl_sv_2nv_flags(aTHX_ a,b)
 #define sv_2pv_flags(a,b,c)	Perl_sv_2pv_flags(aTHX_ a,b,c)
-#define sv_2pvbyte(a,b)		Perl_sv_2pvbyte(aTHX_ a,b)
-#define sv_2pvutf8(a,b)		Perl_sv_2pvutf8(aTHX_ a,b)
+#define sv_2pvbyte_flags(a,b,c)	Perl_sv_2pvbyte_flags(aTHX_ a,b,c)
+#define sv_2pvutf8_flags(a,b,c)	Perl_sv_2pvutf8_flags(aTHX_ a,b,c)
 #define sv_2uv_flags(a,b)	Perl_sv_2uv_flags(aTHX_ a,b)
 #define sv_backoff		Perl_sv_backoff
 #define sv_bless(a,b)		Perl_sv_bless(aTHX_ a,b)
diff --git a/proto.h b/proto.h
index b2a68f5..8b49e25 100644
--- a/proto.h
+++ b/proto.h
@@ -3183,8 +3183,9 @@ PERL_CALLCONV char*	Perl_sv_2pv_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
-PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp);
-#define PERL_ARGS_ASSERT_SV_2PVBYTE	\
+/* PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp); */
+PERL_CALLCONV char*	Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS	\
 	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
@@ -3193,8 +3194,9 @@ PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
-PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp);
-#define PERL_ARGS_ASSERT_SV_2PVUTF8	\
+/* PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp); */
+PERL_CALLCONV char*	Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS	\
 	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8_nolen(pTHX_ SV* sv)
diff --git a/sv.c b/sv.c
index e8c372f..86c39e0 100644
--- a/sv.c
+++ b/sv.c
@@ -3270,18 +3270,19 @@ Usually accessed via the C<SvPVbyte> macro.
 */
 
 char *
-Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVBYTE;
+    PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS;
 
-    SvGETMAGIC(sv);
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
      || isGV_with_GP(sv) || SvROK(sv)) {
 	SV *sv2 = sv_newmortal();
 	sv_copypv_nomg(sv2,sv);
 	sv = sv2;
     }
-    sv_utf8_downgrade(sv,0);
+    sv_utf8_downgrade_nomg(sv,0);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
 
@@ -3297,15 +3298,18 @@ Usually accessed via the C<SvPVutf8> macro.
 */
 
 char *
-Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVUTF8;
+    PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS;
 
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
-     || isGV_with_GP(sv) || SvROK(sv))
-	sv = sv_mortalcopy(sv);
-    else
-        SvGETMAGIC(sv);
+     || isGV_with_GP(sv) || SvROK(sv)) {
+        SV *sv2 = sv_newmortal();
+        sv_copypv_nomg(sv2,sv);
+        sv = sv2;
+    }
     sv_utf8_upgrade_nomg(sv);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
diff --git a/sv.h b/sv.h
index 4bf948f..8f60c3f 100644
--- a/sv.h
+++ b/sv.h
@@ -1726,6 +1726,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1740,6 +1744,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_nomg(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
@@ -1931,7 +1939,9 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_copypv_nomg(dsv, ssv) sv_copypv_flags(dsv, ssv, 0)
 #define sv_2pv(sv, lp) sv_2pv_flags(sv, lp, SV_GMAGIC)
 #define sv_2pv_nolen(sv) sv_2pv(sv, 0)
+#define sv_2pvbyte(sv, lp) sv_2pvbyte_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvbyte_nolen(sv) sv_2pvbyte(sv, 0)
+#define sv_2pvutf8(sv, lp) sv_2pvutf8_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvutf8_nolen(sv) sv_2pvutf8(sv, 0)
 #define sv_2pv_nomg(sv, lp) sv_2pv_flags(sv, lp, 0)
 #define sv_pvn_force(sv, lp) sv_pvn_force_flags(sv, lp, SV_GMAGIC)
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2018

From @pali

v4-0004-Implement-SvPV-_or_null.patch
From bf8013240858c9142cb9ad23c1ce40e29940f881 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 15:10:04 +0100
Subject: [PATCH 4/4] Implement SvPV*_or_null*

---
 sv.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/sv.h b/sv.h
index 8f60c3f..c5fd59b 100644
--- a/sv.h
+++ b/sv.h
@@ -1726,10 +1726,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_or_null(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_nomg(sv, lp) \
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
 
+#define SvPVutf8_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1744,10 +1754,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_or_null(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_nomg(sv, lp) \
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
 
+#define SvPVbyte_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Oct 3, 2018

From @tonycoz

On Wed, 26 Sep 2018 08​:59​:32 -0700, pali@​cpan.org wrote​:

In attachment is a new version of patches.

You originally posted these as RFC, so I won't apply them right now, so some comments​:

a) you modify do_vecset() and do_vecget() presumably to reduce the number of magic calls they make, but you don't test for it (see t/op/bop.t for some examples.)

b) your second patch actually fixes a bug with vecset() on magic scalars, for example, with the following in bop.t​:

my $zstr = "d";
utf8​::upgrade($zstr);
tie $z, "main", $zstr;
is((vec($z, 0, 1) = 1), 1, "vecset (trivial utf8)");
is(fetches($z), 1, "vecset fetches (trivial utf8)");
is(stores($z), 1, "vecset stores (trivial utf8)");

the vecset() writes to freed memory (the extra magic calls allocate a new PV, but the code uses the old PV.)[1]

c) None of your new macros/functions have documentation

d) I'm not sure the _or_null macros add much.

Tony

[1]

=================================================================
==17395==ERROR​: AddressSanitizer​: heap-use-after-free on address 0x60200004a530 at pc 0x5625ea1c3b7c bp 0x7ffc232fe100 sp 0x7ffc232fe0f8
READ of size 1 at 0x60200004a530 thread T0
  #0 0x5625ea1c3b7b in Perl_do_vecset /home/tony/dev/perl/git/perl/doop.c​:974
  #1 0x5625e9e7e4b6 in Perl_magic_setvec /home/tony/dev/perl/git/perl/mg.c​:2422
  #2 0x5625e9e63e7a in Perl_mg_set /home/tony/dev/perl/git/perl/mg.c​:296
  #3 0x5625e9efc26b in Perl_pp_sassign /home/tony/dev/perl/git/perl/pp_hot.c​:226
  #4 0x5625e9e371c0 in Perl_runops_debug /home/tony/dev/perl/git/perl/dump.c​:2536
  #5 0x5625e9b89527 in S_run_body /home/tony/dev/perl/git/perl/perl.c​:2688
  #6 0x5625e9b877d9 in perl_run /home/tony/dev/perl/git/perl/perl.c​:2611
  #7 0x5625e9acecbe in main /home/tony/dev/perl/git/perl/perlmain.c​:122
  #8 0x7f191a7352e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
  #9 0x5625e9ace9f9 in _start (/home/tony/dev/perl/git/perl/perl+0x1479f9)

0x60200004a530 is located 0 bytes inside of 10-byte region [0x60200004a530,0x60200004a53a)
freed by thread T0 here​:
  #0 0x7f191b8eda10 in free (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1a10)
  #1 0x5625e9e3af90 in Perl_safesysfree /home/tony/dev/perl/git/perl/util.c​:385
  #2 0x5625e9fa9342 in Perl_sv_setsv_flags /home/tony/dev/perl/git/perl/sv.c​:4568
  #3 0x5625e9e77946 in S_magic_methpack /home/tony/dev/perl/git/perl/mg.c​:1982
  #4 0x5625e9e77ad5 in Perl_magic_getpack /home/tony/dev/perl/git/perl/mg.c​:1993
  #5 0x5625e9e62f2a in Perl_mg_get /home/tony/dev/perl/git/perl/mg.c​:201
  #6 0x5625e9f77d99 in Perl_sv_2pv_flags /home/tony/dev/perl/git/perl/sv.c​:2948
  #7 0x5625e9f88321 in Perl_sv_utf8_downgrade /home/tony/dev/perl/git/perl/sv.c​:3628
  #8 0x5625ea1c2c44 in Perl_do_vecset /home/tony/dev/perl/git/perl/doop.c​:940
  #9 0x5625e9e7e4b6 in Perl_magic_setvec /home/tony/dev/perl/git/perl/mg.c​:2422
  #10 0x5625e9e63e7a in Perl_mg_set /home/tony/dev/perl/git/perl/mg.c​:296
  #11 0x5625e9efc26b in Perl_pp_sassign /home/tony/dev/perl/git/perl/pp_hot.c​:226
  #12 0x5625e9e371c0 in Perl_runops_debug /home/tony/dev/perl/git/perl/dump.c​:2536
  #13 0x5625e9b89527 in S_run_body /home/tony/dev/perl/git/perl/perl.c​:2688
  #14 0x5625e9b877d9 in perl_run /home/tony/dev/perl/git/perl/perl.c​:2611
  #15 0x5625e9acecbe in main /home/tony/dev/perl/git/perl/perlmain.c​:122
  #16 0x7f191a7352e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)

previously allocated by thread T0 here​:
  #0 0x7f191b8edd28 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1d28)
  #1 0x5625e9e3a8a7 in Perl_safesysmalloc /home/tony/dev/perl/git/perl/util.c​:153
  #2 0x5625e9f5ebc8 in Perl_sv_grow /home/tony/dev/perl/git/perl/sv.c​:1600
  #3 0x5625e9face52 in Perl_sv_setsv_flags /home/tony/dev/perl/git/perl/sv.c​:4643
  #4 0x5625e9f3cce4 in Perl_leave_adjust_stacks /home/tony/dev/perl/git/perl/pp_hot.c​:4826
  #5 0x5625e9f3d479 in Perl_pp_leavesub /home/tony/dev/perl/git/perl/pp_hot.c​:4902
  #6 0x5625e9e371c0 in Perl_runops_debug /home/tony/dev/perl/git/perl/dump.c​:2536
  #7 0x5625e9b8c2a6 in Perl_call_sv /home/tony/dev/perl/git/perl/perl.c​:2998
  #8 0x5625e9e76f9b in Perl_magic_methcall /home/tony/dev/perl/git/perl/mg.c​:1936
  #9 0x5625e9e77831 in S_magic_methcall1 /home/tony/dev/perl/git/perl/mg.c​:1968
  #10 0x5625e9e77923 in S_magic_methpack /home/tony/dev/perl/git/perl/mg.c​:1980
  #11 0x5625e9e77ad5 in Perl_magic_getpack /home/tony/dev/perl/git/perl/mg.c​:1993
  #12 0x5625e9e62f2a in Perl_mg_get /home/tony/dev/perl/git/perl/mg.c​:201
  #13 0x5625ea004bf6 in Perl_sv_pvn_force_flags /home/tony/dev/perl/git/perl/sv.c​:10015
  #14 0x5625ea1c2bd8 in Perl_do_vecset /home/tony/dev/perl/git/perl/doop.c​:933
  #15 0x5625e9e7e4b6 in Perl_magic_setvec /home/tony/dev/perl/git/perl/mg.c​:2422
  #16 0x5625e9e63e7a in Perl_mg_set /home/tony/dev/perl/git/perl/mg.c​:296
  #17 0x5625e9efc26b in Perl_pp_sassign /home/tony/dev/perl/git/perl/pp_hot.c​:226
  #18 0x5625e9e371c0 in Perl_runops_debug /home/tony/dev/perl/git/perl/dump.c​:2536
  #19 0x5625e9b89527 in S_run_body /home/tony/dev/perl/git/perl/perl.c​:2688
  #20 0x5625e9b877d9 in perl_run /home/tony/dev/perl/git/perl/perl.c​:2611
  #21 0x5625e9acecbe in main /home/tony/dev/perl/git/perl/perlmain.c​:122
  #22 0x7f191a7352e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)

SUMMARY​: AddressSanitizer​: heap-use-after-free /home/tony/dev/perl/git/perl/doop.c​:974 in Perl_do_vecset
Shadow bytes around the buggy address​:
  0x0c0480001450​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001460​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001470​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001480​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001490​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c04800014a0​: fa fa 00 02 fa fa[fd]fd fa fa fd fd fa fa fd fd
  0x0c04800014b0​: fa fa fd fa fa fa fd fd fa fa fd fd fa fa fd fd
  0x0c04800014c0​: fa fa fd fd fa fa fd fa fa fa 00 00 fa fa fd fd
  0x0c04800014d0​: fa fa fd fa fa fa fd fa fa fa fd fd fa fa fd fd
  0x0c04800014e0​: fa fa fd fd fa fa fd fd fa fa fd fd fa fa fd fd
  0x0c04800014f0​: fa fa 00 02 fa fa fd fd fa fa fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes)​:
  Addressable​: 00
  Partially addressable​: 01 02 03 04 05 06 07
  Heap left redzone​: fa
  Heap right redzone​: fb
  Freed heap region​: fd
  Stack left redzone​: f1
  Stack mid redzone​: f2
  Stack right redzone​: f3
  Stack partial redzone​: f4
  Stack after return​: f5
  Stack use after scope​: f8
  Global redzone​: f9
  Global init order​: f6
  Poisoned by user​: f7
  Container overflow​: fc
  Array cookie​: ac
  Intra object redzone​: bb
  ASan internal​: fe
  Left alloca redzone​: ca
  Right alloca redzone​: cb
==17395==ABORTING
op/bop.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 370/516 subtests

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Oct 3, 2018

From @pali

On Tuesday 02 October 2018 21​:35​:53 Tony Cook via RT wrote​:

On Wed, 26 Sep 2018 08​:59​:32 -0700, pali@​cpan.org wrote​:

In attachment is a new version of patches.

You originally posted these as RFC, so I won't apply them right now, so some comments​:

a) you modify do_vecset() and do_vecget() presumably to reduce the number of magic calls they make, but you don't test for it (see t/op/bop.t for some examples.)

I dot not know how to write tests for these calls. If somebody else can
prepare them, it would be great.

b) your second patch actually fixes a bug with vecset() on magic scalars, for example, with the following in bop.t​:

my $zstr = "d";
utf8​::upgrade($zstr);
tie $z, "main", $zstr;
is((vec($z, 0, 1) = 1), 1, "vecset (trivial utf8)");
is(fetches($z), 1, "vecset fetches (trivial utf8)");
is(stores($z), 1, "vecset stores (trivial utf8)");

the vecset() writes to freed memory (the extra magic calls allocate a new PV, but the code uses the old PV.)[1]

I'm not aware of the fact that patch fixed some memory problem :D

c) None of your new macros/functions have documentation

Where should be documentation put?

d) I'm not sure the _or_null macros add much.

I described reason in the first email.

Example​:

void
function(arg)
  SV *arg
PREINIT​:
  char *str;
  STRLEN len;
CODE​:
  str = SvPVutf8_or_null(arg, len);
  ...
  do_something(str, len);
  ...

Basically you do not need to handle undef specially, and write XS
function in less lines.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Oct 4, 2018

From @tonycoz

On Wed, 03 Oct 2018 02​:00​:56 -0700, pali@​cpan.org wrote​:

On Tuesday 02 October 2018 21​:35​:53 Tony Cook via RT wrote​:

On Wed, 26 Sep 2018 08​:59​:32 -0700, pali@​cpan.org wrote​:

In attachment is a new version of patches.

You originally posted these as RFC, so I won't apply them right now,
so some comments​:

a) you modify do_vecset() and do_vecget() presumably to reduce the
number of magic calls they make, but you don't test for it (see
t/op/bop.t for some examples.)

I dot not know how to write tests for these calls. If somebody else
can
prepare them, it would be great.

bop.t has code that does similar checks for the number of magic calls done.

b) your second patch actually fixes a bug with vecset() on magic
scalars, for example, with the following in bop.t​:

my $zstr = "d";
utf8​::upgrade($zstr);
tie $z, "main", $zstr;
is((vec($z, 0, 1) = 1), 1, "vecset (trivial utf8)");
is(fetches($z), 1, "vecset fetches (trivial utf8)");
is(stores($z), 1, "vecset stores (trivial utf8)");

the vecset() writes to freed memory (the extra magic calls allocate a
new PV, but the code uses the old PV.)[1]

I'm not aware of the fact that patch fixed some memory problem :D

c) None of your new macros/functions have documentation

Where should be documentation put?

Where other similar code is documented.

The SvPV macros in sv.h, the new downgrade where the old downgrade is documented.

Tony

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 28, 2019

From @pali

In attachment is V5 version of patches. There is updated documentation for them.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 28, 2019

From @pali

v5-0001-Implement-sv_utf8_downgrade_nomg.patch
From d70eebea73163b8ad76da81702eb5508cc7252a4 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:40:47 +0100
Subject: [PATCH v5 1/4] Implement sv_utf8_downgrade_nomg

---
 embed.fnc |  4 +++-
 embed.h   |  2 +-
 mathoms.c |  8 ++++++++
 proto.h   |  6 ++++++
 sv.c      | 21 ++++++++++++++++++---
 sv.h      |  2 ++
 6 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 2f8dd63487..3e020bf7d8 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2045,7 +2045,9 @@ ApmdbR	|char*	|sv_pvutf8	|NN SV *sv
 ApmdbR	|char*	|sv_pvbyte	|NN SV *sv
 Apmdb	|STRLEN	|sv_utf8_upgrade|NN SV *sv
 Amd	|STRLEN	|sv_utf8_upgrade_nomg|NN SV *sv
-Apd	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+Apdmb	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+Amd	|bool	|sv_utf8_downgrade_nomg|NN SV *const sv|const bool fail_ok
+Apd	|bool	|sv_utf8_downgrade_flags|NN SV *const sv|const bool fail_ok|const U32 flags
 Apd	|void	|sv_utf8_encode |NN SV *const sv
 Apd	|bool	|sv_utf8_decode |NN SV *const sv
 Apdmb	|void	|sv_force_normal|NN SV *sv
diff --git a/embed.h b/embed.h
index 9439f4083b..54005c4d01 100644
--- a/embed.h
+++ b/embed.h
@@ -850,7 +850,7 @@
 #define sv_upgrade(a,b)		Perl_sv_upgrade(aTHX_ a,b)
 #define sv_usepvn_flags(a,b,c,d)	Perl_sv_usepvn_flags(aTHX_ a,b,c,d)
 #define sv_utf8_decode(a)	Perl_sv_utf8_decode(aTHX_ a)
-#define sv_utf8_downgrade(a,b)	Perl_sv_utf8_downgrade(aTHX_ a,b)
+#define sv_utf8_downgrade_flags(a,b,c)	Perl_sv_utf8_downgrade_flags(aTHX_ a,b,c)
 #define sv_utf8_encode(a)	Perl_sv_utf8_encode(aTHX_ a)
 #define sv_utf8_upgrade_flags_grow(a,b,c)	Perl_sv_utf8_upgrade_flags_grow(aTHX_ a,b,c)
 #ifndef NO_MATHOMS
diff --git a/mathoms.c b/mathoms.c
index b8dcb8913d..da52a41604 100644
--- a/mathoms.c
+++ b/mathoms.c
@@ -1761,6 +1761,14 @@ Perl_newSVsv(pTHX_ SV *const old)
     return newSVsv(old);
 }
 
+bool
+Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+{
+    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
+
+    return sv_utf8_downgrade(sv, fail_ok);
+}
+
 #endif /* NO_MATHOMS */
 
 /*
diff --git a/proto.h b/proto.h
index 500c5813c6..c6ad3ddc2e 100644
--- a/proto.h
+++ b/proto.h
@@ -3661,9 +3661,15 @@ PERL_CALLCONV void	Perl_sv_usepvn_mg(pTHX_ SV *sv, char *ptr, STRLEN len);
 PERL_CALLCONV bool	Perl_sv_utf8_decode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_DECODE	\
 	assert(sv)
+#ifndef NO_MATHOMS
 PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok);
 #define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE	\
 	assert(sv)
+#endif
+PERL_CALLCONV bool	Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS	\
+	assert(sv)
+/* PERL_CALLCONV bool	sv_utf8_downgrade_nomg(pTHX_ SV *const sv, const bool fail_ok); */
 PERL_CALLCONV void	Perl_sv_utf8_encode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_ENCODE	\
 	assert(sv)
diff --git a/sv.c b/sv.c
index 2123cf497b..2ab5f224c8 100644
--- a/sv.c
+++ b/sv.c
@@ -3648,11 +3648,23 @@ true, croaks.
 This is not a general purpose Unicode to byte encoding interface:
 use the C<Encode> extension for that.
 
+This function process get magic on C<sv>.
+
+=for apidoc sv_utf8_downgrade_nomg
+
+Like C<sv_utf8_downgrade>, but does not process get magic on C<sv>.
+
+=for apidoc sv_utf8_downgrade_flags
+
+Like C<sv_utf8_downgrade>, but with additional C<flags>.
+If C<flags> has C<SV_GMAGIC> bit set, then this function process
+get magic on C<sv>.
+
 =cut
 */
 
 bool
-Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags)
 {
     PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
 
@@ -3660,7 +3672,10 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
         if (SvCUR(sv)) {
 	    U8 *s;
 	    STRLEN len;
-	    int mg_flags = SV_GMAGIC;
+	    int mg_flags = 0;
+
+            if (flags & SV_GMAGIC)
+                mg_flags = SV_GMAGIC;
 
             if (SvIsCOW(sv)) {
                 S_sv_uncow(aTHX_ sv, 0);
@@ -3670,7 +3685,7 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
 		MAGIC * mg = mg_find(sv, PERL_MAGIC_regex_global);
 		if (mg && mg->mg_len > 0 && mg->mg_flags & MGf_BYTES) {
 			mg->mg_len = sv_pos_b2u_flags(sv, mg->mg_len,
-						SV_GMAGIC|SV_CONST_RETURN);
+						mg_flags|SV_CONST_RETURN);
 			mg_flags = 0; /* sv_pos_b2u does get magic */
 		}
 		if ((mg = mg_find(sv, PERL_MAGIC_utf8)))
diff --git a/sv.h b/sv.h
index 3a648e4971..deb365ea84 100644
--- a/sv.h
+++ b/sv.h
@@ -1915,6 +1915,8 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_pvn_force_nomg(sv, lp) sv_pvn_force_flags(sv, lp, 0)
 #define sv_utf8_upgrade_flags(sv, flags) sv_utf8_upgrade_flags_grow(sv, flags, 0)
 #define sv_utf8_upgrade_nomg(sv) sv_utf8_upgrade_flags(sv, 0)
+#define sv_utf8_downgrade(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, SV_GMAGIC)
+#define sv_utf8_downgrade_nomg(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, 0)
 #define sv_catpvn_nomg(dsv, sstr, slen) sv_catpvn_flags(dsv, sstr, slen, 0)
 #define sv_catpv_nomg(dsv, sstr) sv_catpv_flags(dsv, sstr, 0)
 #define sv_setsv(dsv, ssv) \
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 28, 2019

From @pali

v5-0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
From 35002d119911774e3d5d3f7ee4a80c7742aa41fd Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:19 +0100
Subject: [PATCH v5 2/4] Fix do_vecget and do_vecset to process GET magic only
 once

---
 doop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doop.c b/doop.c
index 54e35f10a6..febff6e217 100644
--- a/doop.c
+++ b/doop.c
@@ -758,7 +758,7 @@ Perl_do_vecget(pTHX_ SV *sv, STRLEN offset, int size)
 	Perl_croak(aTHX_ "Illegal number of bits in vec");
 
     if (SvUTF8(sv)) {
-	if (Perl_sv_utf8_downgrade(aTHX_ sv, TRUE)) {
+	if (Perl_sv_utf8_downgrade_flags(aTHX_ sv, TRUE, 0)) {
             /* PVX may have changed */
             s = (unsigned char *) SvPV_flags(sv, srclen, svpv_flags);
         }
@@ -934,10 +934,10 @@ Perl_do_vecset(pTHX_ SV *sv)
                                          SV_GMAGIC | SV_UNDEF_RETURNS_NULL);
     if (SvUTF8(targ)) {
 	/* This is handled by the SvPOK_only below...
-	if (!Perl_sv_utf8_downgrade(aTHX_ targ, TRUE))
+	if (!Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0))
 	    SvUTF8_off(targ);
 	 */
-	(void) Perl_sv_utf8_downgrade(aTHX_ targ, TRUE);
+	(void) Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0);
     }
 
     (void)SvPOK_only(targ);
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 28, 2019

From @pali

v5-0003-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
From 16531de1850359b66e1bbf3afdc8fde218fbb596 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:46 +0100
Subject: [PATCH v5 3/4] Implement SvPVutf8_nomg and SvPVbyte_nomg

---
 embed.fnc |  6 ++++--
 embed.h   |  4 ++--
 mathoms.c | 16 ++++++++++++++++
 proto.h   | 10 ++++++++++
 sv.c      | 24 ++++++++++++++----------
 sv.h      | 16 ++++++++++++++++
 6 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 3e020bf7d8..e429ab065b 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1538,8 +1538,10 @@ Apd	|NV	|sv_2nv_flags	|NN SV *const sv|const I32 flags
 pMd	|SV*	|sv_2num	|NN SV *const sv
 Apmb	|char*	|sv_2pv		|NN SV *sv|NULLOK STRLEN *lp
 Apd	|char*	|sv_2pv_flags	|NN SV *const sv|NULLOK STRLEN *const lp|const I32 flags
-Apd	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
-Apd	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Apdmb	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
+Xpd	|char*	|sv_2pvutf8_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
+Apdmb	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Xpd	|char*	|sv_2pvbyte_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
 Abp	|char*	|sv_pvn_nomg	|NN SV* sv|NULLOK STRLEN* lp
 Apmb	|UV	|sv_2uv		|NN SV *sv
 Apd	|UV	|sv_2uv_flags	|NN SV *const sv|const I32 flags
diff --git a/embed.h b/embed.h
index 54005c4d01..8f3f369540 100644
--- a/embed.h
+++ b/embed.h
@@ -723,8 +723,8 @@
 #define sv_2mortal(a)		Perl_sv_2mortal(aTHX_ a)
 #define sv_2nv_flags(a,b)	Perl_sv_2nv_flags(aTHX_ a,b)
 #define sv_2pv_flags(a,b,c)	Perl_sv_2pv_flags(aTHX_ a,b,c)
-#define sv_2pvbyte(a,b)		Perl_sv_2pvbyte(aTHX_ a,b)
-#define sv_2pvutf8(a,b)		Perl_sv_2pvutf8(aTHX_ a,b)
+#define sv_2pvbyte_flags(a,b,c)	Perl_sv_2pvbyte_flags(aTHX_ a,b,c)
+#define sv_2pvutf8_flags(a,b,c)	Perl_sv_2pvutf8_flags(aTHX_ a,b,c)
 #define sv_2uv_flags(a,b)	Perl_sv_2uv_flags(aTHX_ a,b)
 #define sv_backoff		Perl_sv_backoff
 #define sv_bless(a,b)		Perl_sv_bless(aTHX_ a,b)
diff --git a/mathoms.c b/mathoms.c
index da52a41604..1ba9c2367a 100644
--- a/mathoms.c
+++ b/mathoms.c
@@ -1769,6 +1769,22 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
     return sv_utf8_downgrade(sv, fail_ok);
 }
 
+char *
+Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+{
+    PERL_ARGS_ASSERT_SV_2PVUTF8;
+
+    return sv_2pvutf8(sv, lp);
+}
+
+char *
+Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+{
+    PERL_ARGS_ASSERT_SV_2PVBYTE;
+
+    return sv_2pvbyte(sv, lp);
+}
+
 #endif /* NO_MATHOMS */
 
 /*
diff --git a/proto.h b/proto.h
index c6ad3ddc2e..2c6892bf5a 100644
--- a/proto.h
+++ b/proto.h
@@ -3192,9 +3192,14 @@ PERL_CALLCONV char*	Perl_sv_2pv_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVBYTE	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
@@ -3202,9 +3207,14 @@ PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVUTF8	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
diff --git a/sv.c b/sv.c
index 2ab5f224c8..fd837de54f 100644
--- a/sv.c
+++ b/sv.c
@@ -3321,18 +3321,19 @@ Usually accessed via the C<SvPVbyte> macro.
 */
 
 char *
-Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVBYTE;
+    PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS;
 
-    SvGETMAGIC(sv);
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
      || isGV_with_GP(sv) || SvROK(sv)) {
 	SV *sv2 = sv_newmortal();
 	sv_copypv_nomg(sv2,sv);
 	sv = sv2;
     }
-    sv_utf8_downgrade(sv,0);
+    sv_utf8_downgrade_nomg(sv,0);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
 
@@ -3348,15 +3349,18 @@ Usually accessed via the C<SvPVutf8> macro.
 */
 
 char *
-Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVUTF8;
+    PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS;
 
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
-     || isGV_with_GP(sv) || SvROK(sv))
-	sv = sv_mortalcopy(sv);
-    else
-        SvGETMAGIC(sv);
+     || isGV_with_GP(sv) || SvROK(sv)) {
+        SV *sv2 = sv_newmortal();
+        sv_copypv_nomg(sv2,sv);
+        sv = sv2;
+    }
     sv_utf8_upgrade_nomg(sv);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
diff --git a/sv.h b/sv.h
index deb365ea84..9fe5359e92 100644
--- a/sv.h
+++ b/sv.h
@@ -1596,6 +1596,9 @@ Like C<SvPV_force>, but converts C<sv> to UTF-8 first if necessary.
 =for apidoc Am|char*|SvPVutf8|SV* sv|STRLEN len
 Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.
 
+=for apidoc Am|char*|SvPVutf8_nomg|SV* sv|STRLEN len
+Like C<SvPVutf8>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVutf8_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to UTF-8 first if necessary.
 
@@ -1605,6 +1608,9 @@ Like C<SvPV_force>, but converts C<sv> to byte representation first if necessary
 =for apidoc Am|char*|SvPVbyte|SV* sv|STRLEN len
 Like C<SvPV>, but converts C<sv> to byte representation first if necessary.
 
+=for apidoc Am|char*|SvPVbyte_nomg|SV* sv|STRLEN len
+Like C<SvPVbyte>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVbyte_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to byte representation first if necessary.
 
@@ -1726,6 +1732,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1740,6 +1750,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_nomg(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
@@ -1931,7 +1945,9 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_copypv_nomg(dsv, ssv) sv_copypv_flags(dsv, ssv, 0)
 #define sv_2pv(sv, lp) sv_2pv_flags(sv, lp, SV_GMAGIC)
 #define sv_2pv_nolen(sv) sv_2pv(sv, 0)
+#define sv_2pvbyte(sv, lp) sv_2pvbyte_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvbyte_nolen(sv) sv_2pvbyte(sv, 0)
+#define sv_2pvutf8(sv, lp) sv_2pvutf8_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvutf8_nolen(sv) sv_2pvutf8(sv, 0)
 #define sv_2pv_nomg(sv, lp) sv_2pv_flags(sv, lp, 0)
 #define sv_pvn_force(sv, lp) sv_pvn_force_flags(sv, lp, SV_GMAGIC)
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Feb 28, 2019

From @pali

v5-0004-Implement-SvPV-_or_null.patch
From 3aa1c8d06aa8c762967d499659c3ba3524fd4720 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 15:10:04 +0100
Subject: [PATCH v5 4/4] Implement SvPV*_or_null*

---
 sv.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/sv.h b/sv.h
index 9fe5359e92..41cb597b1f 100644
--- a/sv.h
+++ b/sv.h
@@ -1599,6 +1599,12 @@ Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.
 =for apidoc Am|char*|SvPVutf8_nomg|SV* sv|STRLEN len
 Like C<SvPVutf8>, but does not process get magic.
 
+=for apidoc Am|char*|SvPVutf8_or_null|SV* sv|STRLEN len
+Like C<SvPVutf8>, but when C<sv> is undef then returns C<NULL>.
+
+=for apidoc Am|char*|SvPVutf8_or_null_nomg|SV* sv|STRLEN len
+Like C<SvPVutf8_or_null>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVutf8_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to UTF-8 first if necessary.
 
@@ -1611,6 +1617,12 @@ Like C<SvPV>, but converts C<sv> to byte representation first if necessary.
 =for apidoc Am|char*|SvPVbyte_nomg|SV* sv|STRLEN len
 Like C<SvPVbyte>, but does not process get magic.
 
+=for apidoc Am|char*|SvPVbyte_or_null|SV* sv|STRLEN len
+Like C<SvPVbyte>, but when C<sv> is undef then returns C<NULL>.
+
+=for apidoc Am|char*|SvPVbyte_or_null_nomg|SV* sv|STRLEN len
+Like C<SvPVbyte_or_null>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVbyte_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to byte representation first if necessary.
 
@@ -1732,10 +1744,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_or_null(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_nomg(sv, lp) \
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
 
+#define SvPVutf8_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1750,10 +1772,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_or_null(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_nomg(sv, lp) \
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
 
+#define SvPVbyte_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 19, 2019

From @pali

On Thursday 28 February 2019 15​:57​:17 pali@​cpan.org wrote​:

In attachment is V5 version of patches. There is updated documentation for them.

Tony, can you look at new patch version?

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 20, 2019

From @tonycoz

On Tue, 19 Mar 2019 01​:55​:18 -0700, pali@​cpan.org wrote​:

On Thursday 28 February 2019 15​:57​:17 pali@​cpan.org wrote​:

In attachment is V5 version of patches. There is updated
documentation for them.

Tony, can you look at new patch version?

make test fails.

There still aren't any vec() tests (see the attached.)

There's no tests for the new APIs (which can go in ext/XS-APItest/)

Tony

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 20, 2019

From @tonycoz

0001-test-for-desired-magic-fetches-stores-for-vec.patch
From 6477fcd4c8ae0e0ad01dd73006587cf6f14f9ebb Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 20 Mar 2019 10:53:17 +1100
Subject: test for desired magic fetches/stores for vec()

for #132782
---
 t/op/bop.t | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/t/op/bop.t b/t/op/bop.t
index 411d253a7b..a741cfbbaf 100644
--- a/t/op/bop.t
+++ b/t/op/bop.t
@@ -18,7 +18,7 @@ BEGIN {
 # If you find tests are failing, please try adding names to tests to track
 # down where the failure is, and supply your new names as a patch.
 # (Just-in-time test naming)
-plan tests => 504;
+plan tests => 514;
 
 # numerics
 ok ((0xdead & 0xbeef) == 0x9ead);
@@ -262,6 +262,47 @@ is(~~$y, "c");
 is(fetches($y), 1);
 is(stores($y), 0);
 
+my $g;
+# Note: if the vec() reads are part of the is() calls it's treated as
+# in lvalue context, so we save it separately
+$g = vec($x, 0, 1);
+is($g, (ord("a") & 0x01), "check vec value");
+{
+local $TODO = "fetches for vec read broken";
+is(fetches($x), 1, "fetches for vec read");
+}
+is(stores($x), 0, "stores for vec read");
+# similarly here, and code like:
+#   $g = (vec($x, 0, 1) = 0)
+# results in an extra fetch, since the inner assignment returns the LV
+vec($x, 0, 1) = 0;
+# one fetch in vec() another when the LV is assigned to
+{
+local $TODO = "fetches for vec read broken";
+is(fetches($x), 2, "fetches for vec write");
+}
+is(stores($x), 1, "stores for vec write");
+
+{
+    my $a = "a";
+    utf8::upgrade($a);
+    tie $x, "main", $a;
+    $g = vec($x, 0, 1);
+    is($g, (ord("a") & 0x01), "check vec value (utf8)");
+    {
+    local $TODO = "fetches for vec read broken";
+    is(fetches($x), 1, "fetches for vec read (utf8)");
+    }
+    is(stores($x), 0, "stores for vec read (utf8)");
+    vec($x, 0, 1) = 0;
+    {
+    local $TODO = "fetches for vec read broken";
+    # one fetch in vec() another when the LV is assigned to
+    is(fetches($x), 2, "fetches for vec write (utf8)");
+    }
+    is(stores($x), 1, "stores for vec write (utf8)");
+}
+
 $a = "\0\x{100}"; chop($a);
 ok(utf8::is_utf8($a)); # make sure UTF8 flag is still there
 $a = ~$a;
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 20, 2019

From @tonycoz

On Tue, 19 Mar 2019 20​:49​:46 -0700, tonyc wrote​:

On Tue, 19 Mar 2019 01​:55​:18 -0700, pali@​cpan.org wrote​:

On Thursday 28 February 2019 15​:57​:17 pali@​cpan.org wrote​:

In attachment is V5 version of patches. There is updated
documentation for them.

Tony, can you look at new patch version?

make test fails.

There still aren't any vec() tests (see the attached.)

There's no tests for the new APIs (which can go in ext/XS-APItest/)

Tony

Also​:

- int mg_flags = SV_GMAGIC;
+ int mg_flags = 0;
+
+ if (flags & SV_GMAGIC)
+ mg_flags = SV_GMAGIC;

Can just be​:

  int mg_flags = flags & SV_GMAGIC;

but should probably be​:

  U32 mg_flags = flags & SV_GMAGIC;

Tony

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 28, 2019

From @pali

On Tuesday 19 March 2019 20​:59​:22 Tony Cook via RT wrote​:

On Tue, 19 Mar 2019 20​:49​:46 -0700, tonyc wrote​:

On Tue, 19 Mar 2019 01​:55​:18 -0700, pali@​cpan.org wrote​:

On Thursday 28 February 2019 15​:57​:17 pali@​cpan.org wrote​:

In attachment is V5 version of patches. There is updated
documentation for them.

Tony, can you look at new patch version?

make test fails.

Fixed.

There still aren't any vec() tests (see the attached.)

I included that patch into v6 version.

There's no tests for the new APIs (which can go in ext/XS-APItest/)

Tony

Also​:

- int mg_flags = SV_GMAGIC;
+ int mg_flags = 0;
+
+ if (flags & SV_GMAGIC)
+ mg_flags = SV_GMAGIC;

Can just be​:

int mg_flags = flags & SV_GMAGIC;

but should probably be​:

U32 mg_flags = flags & SV_GMAGIC;

Changed.

New version v6 is in attachment.

Now make test passes. There are some problems with porting/pod_rules.t
and porting/podcheck.t but I have not touched code for those files... so
looks like false-positive.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 28, 2019

From @pali

v6-0001-Implement-sv_utf8_downgrade_nomg.patch
From b053620962bfd8b3a5bef073d530fd9fed5e8066 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:40:47 +0100
Subject: [PATCH 1/5] Implement sv_utf8_downgrade_nomg

---
 embed.fnc |  4 +++-
 embed.h   |  2 +-
 mathoms.c |  8 ++++++++
 proto.h   |  6 ++++++
 sv.c      | 20 ++++++++++++++++----
 sv.h      |  2 ++
 6 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index e2ca5c5808..b5a4c41451 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2045,7 +2045,9 @@ ApmdbR	|char*	|sv_pvutf8	|NN SV *sv
 ApmdbR	|char*	|sv_pvbyte	|NN SV *sv
 Apmdb	|STRLEN	|sv_utf8_upgrade|NN SV *sv
 Amd	|STRLEN	|sv_utf8_upgrade_nomg|NN SV *sv
-Apd	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+Apdmb	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+Amd	|bool	|sv_utf8_downgrade_nomg|NN SV *const sv|const bool fail_ok
+Apd	|bool	|sv_utf8_downgrade_flags|NN SV *const sv|const bool fail_ok|const U32 flags
 Apd	|void	|sv_utf8_encode |NN SV *const sv
 Apd	|bool	|sv_utf8_decode |NN SV *const sv
 Apdmb	|void	|sv_force_normal|NN SV *sv
diff --git a/embed.h b/embed.h
index 94acff2444..eb56be425e 100644
--- a/embed.h
+++ b/embed.h
@@ -850,7 +850,7 @@
 #define sv_upgrade(a,b)		Perl_sv_upgrade(aTHX_ a,b)
 #define sv_usepvn_flags(a,b,c,d)	Perl_sv_usepvn_flags(aTHX_ a,b,c,d)
 #define sv_utf8_decode(a)	Perl_sv_utf8_decode(aTHX_ a)
-#define sv_utf8_downgrade(a,b)	Perl_sv_utf8_downgrade(aTHX_ a,b)
+#define sv_utf8_downgrade_flags(a,b,c)	Perl_sv_utf8_downgrade_flags(aTHX_ a,b,c)
 #define sv_utf8_encode(a)	Perl_sv_utf8_encode(aTHX_ a)
 #define sv_utf8_upgrade_flags_grow(a,b,c)	Perl_sv_utf8_upgrade_flags_grow(aTHX_ a,b,c)
 #ifndef NO_MATHOMS
diff --git a/mathoms.c b/mathoms.c
index b8dcb8913d..da52a41604 100644
--- a/mathoms.c
+++ b/mathoms.c
@@ -1761,6 +1761,14 @@ Perl_newSVsv(pTHX_ SV *const old)
     return newSVsv(old);
 }
 
+bool
+Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+{
+    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
+
+    return sv_utf8_downgrade(sv, fail_ok);
+}
+
 #endif /* NO_MATHOMS */
 
 /*
diff --git a/proto.h b/proto.h
index b9662c6a33..a7f0ff3395 100644
--- a/proto.h
+++ b/proto.h
@@ -3666,9 +3666,15 @@ PERL_CALLCONV void	Perl_sv_usepvn_mg(pTHX_ SV *sv, char *ptr, STRLEN len);
 PERL_CALLCONV bool	Perl_sv_utf8_decode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_DECODE	\
 	assert(sv)
+#ifndef NO_MATHOMS
 PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok);
 #define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE	\
 	assert(sv)
+#endif
+PERL_CALLCONV bool	Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS	\
+	assert(sv)
+/* PERL_CALLCONV bool	sv_utf8_downgrade_nomg(pTHX_ SV *const sv, const bool fail_ok); */
 PERL_CALLCONV void	Perl_sv_utf8_encode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_ENCODE	\
 	assert(sv)
diff --git a/sv.c b/sv.c
index b6d9123971..565a46aac1 100644
--- a/sv.c
+++ b/sv.c
@@ -3648,19 +3648,31 @@ true, croaks.
 This is not a general purpose Unicode to byte encoding interface:
 use the C<Encode> extension for that.
 
+This function process get magic on C<sv>.
+
+=for apidoc sv_utf8_downgrade_nomg
+
+Like C<sv_utf8_downgrade>, but does not process get magic on C<sv>.
+
+=for apidoc sv_utf8_downgrade_flags
+
+Like C<sv_utf8_downgrade>, but with additional C<flags>.
+If C<flags> has C<SV_GMAGIC> bit set, then this function process
+get magic on C<sv>.
+
 =cut
 */
 
 bool
-Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
+    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS;
 
     if (SvPOKp(sv) && SvUTF8(sv)) {
         if (SvCUR(sv)) {
 	    U8 *s;
 	    STRLEN len;
-	    int mg_flags = SV_GMAGIC;
+            U32 mg_flags = flags & SV_GMAGIC;
 
             if (SvIsCOW(sv)) {
                 S_sv_uncow(aTHX_ sv, 0);
@@ -3670,7 +3682,7 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
 		MAGIC * mg = mg_find(sv, PERL_MAGIC_regex_global);
 		if (mg && mg->mg_len > 0 && mg->mg_flags & MGf_BYTES) {
 			mg->mg_len = sv_pos_b2u_flags(sv, mg->mg_len,
-						SV_GMAGIC|SV_CONST_RETURN);
+						mg_flags|SV_CONST_RETURN);
 			mg_flags = 0; /* sv_pos_b2u does get magic */
 		}
 		if ((mg = mg_find(sv, PERL_MAGIC_utf8)))
diff --git a/sv.h b/sv.h
index 3a648e4971..deb365ea84 100644
--- a/sv.h
+++ b/sv.h
@@ -1915,6 +1915,8 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_pvn_force_nomg(sv, lp) sv_pvn_force_flags(sv, lp, 0)
 #define sv_utf8_upgrade_flags(sv, flags) sv_utf8_upgrade_flags_grow(sv, flags, 0)
 #define sv_utf8_upgrade_nomg(sv) sv_utf8_upgrade_flags(sv, 0)
+#define sv_utf8_downgrade(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, SV_GMAGIC)
+#define sv_utf8_downgrade_nomg(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, 0)
 #define sv_catpvn_nomg(dsv, sstr, slen) sv_catpvn_flags(dsv, sstr, slen, 0)
 #define sv_catpv_nomg(dsv, sstr) sv_catpv_flags(dsv, sstr, 0)
 #define sv_setsv(dsv, ssv) \
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 28, 2019

From @pali

v6-0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
From b7ea3a69ae731dfc2ea96a04699630285b259f33 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:19 +0100
Subject: [PATCH 2/5] Fix do_vecget and do_vecset to process GET magic only
 once

---
 doop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doop.c b/doop.c
index 54e35f10a6..febff6e217 100644
--- a/doop.c
+++ b/doop.c
@@ -758,7 +758,7 @@ Perl_do_vecget(pTHX_ SV *sv, STRLEN offset, int size)
 	Perl_croak(aTHX_ "Illegal number of bits in vec");
 
     if (SvUTF8(sv)) {
-	if (Perl_sv_utf8_downgrade(aTHX_ sv, TRUE)) {
+	if (Perl_sv_utf8_downgrade_flags(aTHX_ sv, TRUE, 0)) {
             /* PVX may have changed */
             s = (unsigned char *) SvPV_flags(sv, srclen, svpv_flags);
         }
@@ -934,10 +934,10 @@ Perl_do_vecset(pTHX_ SV *sv)
                                          SV_GMAGIC | SV_UNDEF_RETURNS_NULL);
     if (SvUTF8(targ)) {
 	/* This is handled by the SvPOK_only below...
-	if (!Perl_sv_utf8_downgrade(aTHX_ targ, TRUE))
+	if (!Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0))
 	    SvUTF8_off(targ);
 	 */
-	(void) Perl_sv_utf8_downgrade(aTHX_ targ, TRUE);
+	(void) Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0);
     }
 
     (void)SvPOK_only(targ);
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 28, 2019

From @pali

v6-0003-test-for-desired-magic-fetches-stores-for-vec.patch
From 30be11b11c45dd6d01564b6e375d5b406dd4d0af Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 20 Mar 2019 10:53:17 +1100
Subject: [PATCH 3/5] test for desired magic fetches/stores for vec()

for #132782
---
 t/op/bop.t | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/t/op/bop.t b/t/op/bop.t
index 411d253a7b..a741cfbbaf 100644
--- a/t/op/bop.t
+++ b/t/op/bop.t
@@ -18,7 +18,7 @@ BEGIN {
 # If you find tests are failing, please try adding names to tests to track
 # down where the failure is, and supply your new names as a patch.
 # (Just-in-time test naming)
-plan tests => 504;
+plan tests => 514;
 
 # numerics
 ok ((0xdead & 0xbeef) == 0x9ead);
@@ -262,6 +262,47 @@ is(~~$y, "c");
 is(fetches($y), 1);
 is(stores($y), 0);
 
+my $g;
+# Note: if the vec() reads are part of the is() calls it's treated as
+# in lvalue context, so we save it separately
+$g = vec($x, 0, 1);
+is($g, (ord("a") & 0x01), "check vec value");
+{
+local $TODO = "fetches for vec read broken";
+is(fetches($x), 1, "fetches for vec read");
+}
+is(stores($x), 0, "stores for vec read");
+# similarly here, and code like:
+#   $g = (vec($x, 0, 1) = 0)
+# results in an extra fetch, since the inner assignment returns the LV
+vec($x, 0, 1) = 0;
+# one fetch in vec() another when the LV is assigned to
+{
+local $TODO = "fetches for vec read broken";
+is(fetches($x), 2, "fetches for vec write");
+}
+is(stores($x), 1, "stores for vec write");
+
+{
+    my $a = "a";
+    utf8::upgrade($a);
+    tie $x, "main", $a;
+    $g = vec($x, 0, 1);
+    is($g, (ord("a") & 0x01), "check vec value (utf8)");
+    {
+    local $TODO = "fetches for vec read broken";
+    is(fetches($x), 1, "fetches for vec read (utf8)");
+    }
+    is(stores($x), 0, "stores for vec read (utf8)");
+    vec($x, 0, 1) = 0;
+    {
+    local $TODO = "fetches for vec read broken";
+    # one fetch in vec() another when the LV is assigned to
+    is(fetches($x), 2, "fetches for vec write (utf8)");
+    }
+    is(stores($x), 1, "stores for vec write (utf8)");
+}
+
 $a = "\0\x{100}"; chop($a);
 ok(utf8::is_utf8($a)); # make sure UTF8 flag is still there
 $a = ~$a;
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 28, 2019

From @pali

v6-0004-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
From 6083c520f2be25a874867a1ab791cd6d63e4317e Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:46 +0100
Subject: [PATCH 4/5] Implement SvPVutf8_nomg and SvPVbyte_nomg

---
 embed.fnc |  6 ++++--
 embed.h   |  4 ++--
 mathoms.c | 16 ++++++++++++++++
 proto.h   | 10 ++++++++++
 sv.c      | 24 ++++++++++++++----------
 sv.h      | 16 ++++++++++++++++
 6 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index b5a4c41451..03dafca381 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1538,8 +1538,10 @@ Apd	|NV	|sv_2nv_flags	|NN SV *const sv|const I32 flags
 pMd	|SV*	|sv_2num	|NN SV *const sv
 Apmb	|char*	|sv_2pv		|NN SV *sv|NULLOK STRLEN *lp
 Apd	|char*	|sv_2pv_flags	|NN SV *const sv|NULLOK STRLEN *const lp|const I32 flags
-Apd	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
-Apd	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Apdmb	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
+Ap	|char*	|sv_2pvutf8_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
+Apdmb	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Ap	|char*	|sv_2pvbyte_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
 Abp	|char*	|sv_pvn_nomg	|NN SV* sv|NULLOK STRLEN* lp
 Apmb	|UV	|sv_2uv		|NN SV *sv
 Apd	|UV	|sv_2uv_flags	|NN SV *const sv|const I32 flags
diff --git a/embed.h b/embed.h
index eb56be425e..fe562286bb 100644
--- a/embed.h
+++ b/embed.h
@@ -723,8 +723,8 @@
 #define sv_2mortal(a)		Perl_sv_2mortal(aTHX_ a)
 #define sv_2nv_flags(a,b)	Perl_sv_2nv_flags(aTHX_ a,b)
 #define sv_2pv_flags(a,b,c)	Perl_sv_2pv_flags(aTHX_ a,b,c)
-#define sv_2pvbyte(a,b)		Perl_sv_2pvbyte(aTHX_ a,b)
-#define sv_2pvutf8(a,b)		Perl_sv_2pvutf8(aTHX_ a,b)
+#define sv_2pvbyte_flags(a,b,c)	Perl_sv_2pvbyte_flags(aTHX_ a,b,c)
+#define sv_2pvutf8_flags(a,b,c)	Perl_sv_2pvutf8_flags(aTHX_ a,b,c)
 #define sv_2uv_flags(a,b)	Perl_sv_2uv_flags(aTHX_ a,b)
 #define sv_backoff		Perl_sv_backoff
 #define sv_bless(a,b)		Perl_sv_bless(aTHX_ a,b)
diff --git a/mathoms.c b/mathoms.c
index da52a41604..1ba9c2367a 100644
--- a/mathoms.c
+++ b/mathoms.c
@@ -1769,6 +1769,22 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
     return sv_utf8_downgrade(sv, fail_ok);
 }
 
+char *
+Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+{
+    PERL_ARGS_ASSERT_SV_2PVUTF8;
+
+    return sv_2pvutf8(sv, lp);
+}
+
+char *
+Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+{
+    PERL_ARGS_ASSERT_SV_2PVBYTE;
+
+    return sv_2pvbyte(sv, lp);
+}
+
 #endif /* NO_MATHOMS */
 
 /*
diff --git a/proto.h b/proto.h
index a7f0ff3395..f67ffc47c8 100644
--- a/proto.h
+++ b/proto.h
@@ -3197,9 +3197,14 @@ PERL_CALLCONV char*	Perl_sv_2pv_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVBYTE	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
@@ -3207,9 +3212,14 @@ PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVUTF8	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
diff --git a/sv.c b/sv.c
index 565a46aac1..02e74b5372 100644
--- a/sv.c
+++ b/sv.c
@@ -3321,18 +3321,19 @@ Usually accessed via the C<SvPVbyte> macro.
 */
 
 char *
-Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVBYTE;
+    PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS;
 
-    SvGETMAGIC(sv);
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
      || isGV_with_GP(sv) || SvROK(sv)) {
 	SV *sv2 = sv_newmortal();
 	sv_copypv_nomg(sv2,sv);
 	sv = sv2;
     }
-    sv_utf8_downgrade(sv,0);
+    sv_utf8_downgrade_nomg(sv,0);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
 
@@ -3348,15 +3349,18 @@ Usually accessed via the C<SvPVutf8> macro.
 */
 
 char *
-Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVUTF8;
+    PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS;
 
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
-     || isGV_with_GP(sv) || SvROK(sv))
-	sv = sv_mortalcopy(sv);
-    else
-        SvGETMAGIC(sv);
+     || isGV_with_GP(sv) || SvROK(sv)) {
+        SV *sv2 = sv_newmortal();
+        sv_copypv_nomg(sv2,sv);
+        sv = sv2;
+    }
     sv_utf8_upgrade_nomg(sv);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
diff --git a/sv.h b/sv.h
index deb365ea84..9fe5359e92 100644
--- a/sv.h
+++ b/sv.h
@@ -1596,6 +1596,9 @@ Like C<SvPV_force>, but converts C<sv> to UTF-8 first if necessary.
 =for apidoc Am|char*|SvPVutf8|SV* sv|STRLEN len
 Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.
 
+=for apidoc Am|char*|SvPVutf8_nomg|SV* sv|STRLEN len
+Like C<SvPVutf8>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVutf8_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to UTF-8 first if necessary.
 
@@ -1605,6 +1608,9 @@ Like C<SvPV_force>, but converts C<sv> to byte representation first if necessary
 =for apidoc Am|char*|SvPVbyte|SV* sv|STRLEN len
 Like C<SvPV>, but converts C<sv> to byte representation first if necessary.
 
+=for apidoc Am|char*|SvPVbyte_nomg|SV* sv|STRLEN len
+Like C<SvPVbyte>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVbyte_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to byte representation first if necessary.
 
@@ -1726,6 +1732,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1740,6 +1750,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_nomg(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
@@ -1931,7 +1945,9 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_copypv_nomg(dsv, ssv) sv_copypv_flags(dsv, ssv, 0)
 #define sv_2pv(sv, lp) sv_2pv_flags(sv, lp, SV_GMAGIC)
 #define sv_2pv_nolen(sv) sv_2pv(sv, 0)
+#define sv_2pvbyte(sv, lp) sv_2pvbyte_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvbyte_nolen(sv) sv_2pvbyte(sv, 0)
+#define sv_2pvutf8(sv, lp) sv_2pvutf8_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvutf8_nolen(sv) sv_2pvutf8(sv, 0)
 #define sv_2pv_nomg(sv, lp) sv_2pv_flags(sv, lp, 0)
 #define sv_pvn_force(sv, lp) sv_pvn_force_flags(sv, lp, SV_GMAGIC)
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Mar 28, 2019

From @pali

v6-0005-Implement-SvPV-_or_null.patch
From 560dfa83bee3e9efde3a13bc5f9b831ec43b1ad9 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 15:10:04 +0100
Subject: [PATCH 5/5] Implement SvPV*_or_null*

---
 sv.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/sv.h b/sv.h
index 9fe5359e92..41cb597b1f 100644
--- a/sv.h
+++ b/sv.h
@@ -1599,6 +1599,12 @@ Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.
 =for apidoc Am|char*|SvPVutf8_nomg|SV* sv|STRLEN len
 Like C<SvPVutf8>, but does not process get magic.
 
+=for apidoc Am|char*|SvPVutf8_or_null|SV* sv|STRLEN len
+Like C<SvPVutf8>, but when C<sv> is undef then returns C<NULL>.
+
+=for apidoc Am|char*|SvPVutf8_or_null_nomg|SV* sv|STRLEN len
+Like C<SvPVutf8_or_null>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVutf8_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to UTF-8 first if necessary.
 
@@ -1611,6 +1617,12 @@ Like C<SvPV>, but converts C<sv> to byte representation first if necessary.
 =for apidoc Am|char*|SvPVbyte_nomg|SV* sv|STRLEN len
 Like C<SvPVbyte>, but does not process get magic.
 
+=for apidoc Am|char*|SvPVbyte_or_null|SV* sv|STRLEN len
+Like C<SvPVbyte>, but when C<sv> is undef then returns C<NULL>.
+
+=for apidoc Am|char*|SvPVbyte_or_null_nomg|SV* sv|STRLEN len
+Like C<SvPVbyte_or_null>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVbyte_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to byte representation first if necessary.
 
@@ -1732,10 +1744,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_or_null(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_nomg(sv, lp) \
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
 
+#define SvPVutf8_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1750,10 +1772,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_or_null(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_nomg(sv, lp) \
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
 
+#define SvPVbyte_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented May 19, 2019

From @pali

On Thursday 28 March 2019 13​:41​:30 pali@​cpan.org wrote​:

On Tuesday 19 March 2019 20​:59​:22 Tony Cook via RT wrote​:

On Tue, 19 Mar 2019 20​:49​:46 -0700, tonyc wrote​:

On Tue, 19 Mar 2019 01​:55​:18 -0700, pali@​cpan.org wrote​:

On Thursday 28 February 2019 15​:57​:17 pali@​cpan.org wrote​:

In attachment is V5 version of patches. There is updated
documentation for them.

Tony, can you look at new patch version?

make test fails.

Fixed.

There still aren't any vec() tests (see the attached.)

I included that patch into v6 version.

There's no tests for the new APIs (which can go in ext/XS-APItest/)

Tony

Also​:

- int mg_flags = SV_GMAGIC;
+ int mg_flags = 0;
+
+ if (flags & SV_GMAGIC)
+ mg_flags = SV_GMAGIC;

Can just be​:

int mg_flags = flags & SV_GMAGIC;

but should probably be​:

U32 mg_flags = flags & SV_GMAGIC;

Changed.

New version v6 is in attachment.

Now make test passes. There are some problems with porting/pod_rules.t
and porting/podcheck.t but I have not touched code for those files... so
looks like false-positive.

Tony, can you review V6 patches?

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jul 22, 2019

From @pali

On Tuesday 21 May 2019 10​:17​:47 Tony Cook wrote​:

The only two problems​:

1) The TODO tests need to be disarmed.

op/bop.t (Wstat​: 0 Tests​: 514 Failed​: 0)
TODO passed​: 141, 143, 146, 148
Files=2656, Tests=1220001, 815 wallclock secs (318.43 usr 18.79 sys + 2117.74 cusr 78.50 csys = 2533.46 CPU)
Result​: PASS

You've added the TODO tests after the fixes, you could either remove
the TODO sets entirely, or move the test commit before the doop.c
changes and disarm the TODO sets in the doop.c commit (or a later
commit.)

I removed TODO marks in this test.

2) The other problem is there's still no API tests, as I said in
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=132782#txn-1619990

There's no tests for the new APIs (which can go in ext/XS-APItest/)

I added XS-API tests for SvPVutf8_nomg and SvPVbyte_nomg.

Please look at updated V7 patches.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jul 22, 2019

From @pali

v7-0001-Implement-sv_utf8_downgrade_nomg.patch
From c77f48b199535c89bd57ad056c7df362266e6a1f Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:40:47 +0100
Subject: [PATCH 1/5] Implement sv_utf8_downgrade_nomg

---
 embed.fnc |  4 +++-
 embed.h   |  2 +-
 mathoms.c |  8 ++++++++
 proto.h   |  6 ++++++
 sv.c      | 20 ++++++++++++++++----
 sv.h      |  2 ++
 6 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index aa23c1a6c9..07287892f3 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2111,7 +2111,9 @@ ApmdbR	|char*	|sv_pvutf8	|NN SV *sv
 ApmdbR	|char*	|sv_pvbyte	|NN SV *sv
 Apmdb	|STRLEN	|sv_utf8_upgrade|NN SV *sv
 Amd	|STRLEN	|sv_utf8_upgrade_nomg|NN SV *sv
-Apd	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+Apdmb	|bool	|sv_utf8_downgrade|NN SV *const sv|const bool fail_ok
+Amd	|bool	|sv_utf8_downgrade_nomg|NN SV *const sv|const bool fail_ok
+Apd	|bool	|sv_utf8_downgrade_flags|NN SV *const sv|const bool fail_ok|const U32 flags
 Apd	|void	|sv_utf8_encode |NN SV *const sv
 Apd	|bool	|sv_utf8_decode |NN SV *const sv
 Apdmb	|void	|sv_force_normal|NN SV *sv
diff --git a/embed.h b/embed.h
index 78659236b4..5f7cb5f58f 100644
--- a/embed.h
+++ b/embed.h
@@ -854,7 +854,7 @@
 #define sv_upgrade(a,b)		Perl_sv_upgrade(aTHX_ a,b)
 #define sv_usepvn_flags(a,b,c,d)	Perl_sv_usepvn_flags(aTHX_ a,b,c,d)
 #define sv_utf8_decode(a)	Perl_sv_utf8_decode(aTHX_ a)
-#define sv_utf8_downgrade(a,b)	Perl_sv_utf8_downgrade(aTHX_ a,b)
+#define sv_utf8_downgrade_flags(a,b,c)	Perl_sv_utf8_downgrade_flags(aTHX_ a,b,c)
 #define sv_utf8_encode(a)	Perl_sv_utf8_encode(aTHX_ a)
 #define sv_utf8_upgrade_flags_grow(a,b,c)	Perl_sv_utf8_upgrade_flags_grow(aTHX_ a,b,c)
 #ifndef NO_MATHOMS
diff --git a/mathoms.c b/mathoms.c
index e2dc11c142..6450291317 100644
--- a/mathoms.c
+++ b/mathoms.c
@@ -1761,6 +1761,14 @@ Perl_newSVsv(pTHX_ SV *const old)
     return newSVsv(old);
 }
 
+bool
+Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+{
+    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
+
+    return sv_utf8_downgrade(sv, fail_ok);
+}
+
 #endif /* NO_MATHOMS */
 
 /*
diff --git a/proto.h b/proto.h
index 5eaec442b2..ac500a581e 100644
--- a/proto.h
+++ b/proto.h
@@ -3701,9 +3701,15 @@ PERL_CALLCONV void	Perl_sv_usepvn_mg(pTHX_ SV *sv, char *ptr, STRLEN len);
 PERL_CALLCONV bool	Perl_sv_utf8_decode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_DECODE	\
 	assert(sv)
+#ifndef NO_MATHOMS
 PERL_CALLCONV bool	Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok);
 #define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE	\
 	assert(sv)
+#endif
+PERL_CALLCONV bool	Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS	\
+	assert(sv)
+/* PERL_CALLCONV bool	sv_utf8_downgrade_nomg(pTHX_ SV *const sv, const bool fail_ok); */
 PERL_CALLCONV void	Perl_sv_utf8_encode(pTHX_ SV *const sv);
 #define PERL_ARGS_ASSERT_SV_UTF8_ENCODE	\
 	assert(sv)
diff --git a/sv.c b/sv.c
index 4315fe9b64..f6665bc030 100644
--- a/sv.c
+++ b/sv.c
@@ -3649,19 +3649,31 @@ true, croaks.
 This is not a general purpose Unicode to byte encoding interface:
 use the C<Encode> extension for that.
 
+This function process get magic on C<sv>.
+
+=for apidoc sv_utf8_downgrade_nomg
+
+Like C<sv_utf8_downgrade>, but does not process get magic on C<sv>.
+
+=for apidoc sv_utf8_downgrade_flags
+
+Like C<sv_utf8_downgrade>, but with additional C<flags>.
+If C<flags> has C<SV_GMAGIC> bit set, then this function process
+get magic on C<sv>.
+
 =cut
 */
 
 bool
-Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
+Perl_sv_utf8_downgrade_flags(pTHX_ SV *const sv, const bool fail_ok, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE;
+    PERL_ARGS_ASSERT_SV_UTF8_DOWNGRADE_FLAGS;
 
     if (SvPOKp(sv) && SvUTF8(sv)) {
         if (SvCUR(sv)) {
 	    U8 *s;
 	    STRLEN len;
-	    int mg_flags = SV_GMAGIC;
+            U32 mg_flags = flags & SV_GMAGIC;
 
             if (SvIsCOW(sv)) {
                 S_sv_uncow(aTHX_ sv, 0);
@@ -3671,7 +3683,7 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
 		MAGIC * mg = mg_find(sv, PERL_MAGIC_regex_global);
 		if (mg && mg->mg_len > 0 && mg->mg_flags & MGf_BYTES) {
 			mg->mg_len = sv_pos_b2u_flags(sv, mg->mg_len,
-						SV_GMAGIC|SV_CONST_RETURN);
+						mg_flags|SV_CONST_RETURN);
 			mg_flags = 0; /* sv_pos_b2u does get magic */
 		}
 		if ((mg = mg_find(sv, PERL_MAGIC_utf8)))
diff --git a/sv.h b/sv.h
index 24c728dcd2..53aea18aeb 100644
--- a/sv.h
+++ b/sv.h
@@ -1941,6 +1941,8 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_pvn_force_nomg(sv, lp) sv_pvn_force_flags(sv, lp, 0)
 #define sv_utf8_upgrade_flags(sv, flags) sv_utf8_upgrade_flags_grow(sv, flags, 0)
 #define sv_utf8_upgrade_nomg(sv) sv_utf8_upgrade_flags(sv, 0)
+#define sv_utf8_downgrade(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, SV_GMAGIC)
+#define sv_utf8_downgrade_nomg(sv, fail_ok) sv_utf8_downgrade_flags(sv, fail_ok, 0)
 #define sv_catpvn_nomg(dsv, sstr, slen) sv_catpvn_flags(dsv, sstr, slen, 0)
 #define sv_catpv_nomg(dsv, sstr) sv_catpv_flags(dsv, sstr, 0)
 #define sv_setsv(dsv, ssv) \
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jul 22, 2019

From @pali

v7-0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
From 79605516735866f2c42e6a2e7b5ca979a610842f Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:19 +0100
Subject: [PATCH 2/5] Fix do_vecget and do_vecset to process GET magic only
 once

---
 doop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doop.c b/doop.c
index 00edfcc2e7..49f71e681d 100644
--- a/doop.c
+++ b/doop.c
@@ -758,7 +758,7 @@ Perl_do_vecget(pTHX_ SV *sv, STRLEN offset, int size)
 	Perl_croak(aTHX_ "Illegal number of bits in vec");
 
     if (SvUTF8(sv)) {
-	if (Perl_sv_utf8_downgrade(aTHX_ sv, TRUE)) {
+	if (Perl_sv_utf8_downgrade_flags(aTHX_ sv, TRUE, 0)) {
             /* PVX may have changed */
             s = (unsigned char *) SvPV_flags(sv, srclen, svpv_flags);
         }
@@ -931,10 +931,10 @@ Perl_do_vecset(pTHX_ SV *sv)
                                          SV_GMAGIC | SV_UNDEF_RETURNS_NULL);
     if (SvUTF8(targ)) {
 	/* This is handled by the SvPOK_only below...
-	if (!Perl_sv_utf8_downgrade(aTHX_ targ, TRUE))
+	if (!Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0))
 	    SvUTF8_off(targ);
 	 */
-	(void) Perl_sv_utf8_downgrade(aTHX_ targ, TRUE);
+	(void) Perl_sv_utf8_downgrade_flags(aTHX_ targ, TRUE, 0);
     }
 
     (void)SvPOK_only(targ);
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jul 22, 2019

From @pali

v7-0003-test-for-desired-magic-fetches-stores-for-vec.patch
From 3a1e7b5df42099af3eef92a2c51423de3552e749 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 20 Mar 2019 10:53:17 +1100
Subject: [PATCH 3/5] test for desired magic fetches/stores for vec()

for #132782
---
 t/op/bop.t | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/t/op/bop.t b/t/op/bop.t
index 666dfb8114..efc6172fc1 100644
--- a/t/op/bop.t
+++ b/t/op/bop.t
@@ -18,7 +18,7 @@ BEGIN {
 # If you find tests are failing, please try adding names to tests to track
 # down where the failure is, and supply your new names as a patch.
 # (Just-in-time test naming)
-plan tests => 491;
+plan tests => 501;
 
 # numerics
 ok ((0xdead & 0xbeef) == 0x9ead);
@@ -262,6 +262,35 @@ is(~~$y, "c");
 is(fetches($y), 1);
 is(stores($y), 0);
 
+my $g;
+# Note: if the vec() reads are part of the is() calls it's treated as
+# in lvalue context, so we save it separately
+$g = vec($x, 0, 1);
+is($g, (ord("a") & 0x01), "check vec value");
+is(fetches($x), 1, "fetches for vec read");
+is(stores($x), 0, "stores for vec read");
+# similarly here, and code like:
+#   $g = (vec($x, 0, 1) = 0)
+# results in an extra fetch, since the inner assignment returns the LV
+vec($x, 0, 1) = 0;
+# one fetch in vec() another when the LV is assigned to
+is(fetches($x), 2, "fetches for vec write");
+is(stores($x), 1, "stores for vec write");
+
+{
+    my $a = "a";
+    utf8::upgrade($a);
+    tie $x, "main", $a;
+    $g = vec($x, 0, 1);
+    is($g, (ord("a") & 0x01), "check vec value (utf8)");
+    is(fetches($x), 1, "fetches for vec read (utf8)");
+    is(stores($x), 0, "stores for vec read (utf8)");
+    vec($x, 0, 1) = 0;
+    # one fetch in vec() another when the LV is assigned to
+    is(fetches($x), 2, "fetches for vec write (utf8)");
+    is(stores($x), 1, "stores for vec write (utf8)");
+}
+
 $a = "\0\x{100}"; chop($a);
 ok(utf8::is_utf8($a)); # make sure UTF8 flag is still there
 $a = ~$a;
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jul 22, 2019

From @pali

v7-0004-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
From f496f6b5cc720e9bfaa27910cb18fef3bf2c33a5 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 13:41:46 +0100
Subject: [PATCH 4/5] Implement SvPVutf8_nomg and SvPVbyte_nomg

---
 embed.fnc                 |  6 ++++--
 embed.h                   |  4 ++--
 ext/XS-APItest/APItest.xs | 14 ++++++++++++++
 ext/XS-APItest/t/svpv.t   | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 mathoms.c                 | 16 ++++++++++++++++
 proto.h                   | 10 ++++++++++
 sv.c                      | 24 ++++++++++++++----------
 sv.h                      | 16 ++++++++++++++++
 8 files changed, 122 insertions(+), 15 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 07287892f3..eaf459bcf9 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1603,8 +1603,10 @@ Apd	|NV	|sv_2nv_flags	|NN SV *const sv|const I32 flags
 pxd	|SV*	|sv_2num	|NN SV *const sv
 Apmb	|char*	|sv_2pv		|NN SV *sv|NULLOK STRLEN *lp
 Apd	|char*	|sv_2pv_flags	|NN SV *const sv|NULLOK STRLEN *const lp|const I32 flags
-Apd	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
-Apd	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Apdmb	|char*	|sv_2pvutf8	|NN SV *sv|NULLOK STRLEN *const lp
+Ap	|char*	|sv_2pvutf8_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
+Apdmb	|char*	|sv_2pvbyte	|NN SV *sv|NULLOK STRLEN *const lp
+Ap	|char*	|sv_2pvbyte_flags	|NN SV *sv|NULLOK STRLEN *const lp|const U32 flags
 Abp	|char*	|sv_pvn_nomg	|NN SV* sv|NULLOK STRLEN* lp
 Apmb	|UV	|sv_2uv		|NN SV *sv
 Apd	|UV	|sv_2uv_flags	|NN SV *const sv|const I32 flags
diff --git a/embed.h b/embed.h
index 5f7cb5f58f..450755b912 100644
--- a/embed.h
+++ b/embed.h
@@ -726,8 +726,8 @@
 #define sv_2mortal(a)		Perl_sv_2mortal(aTHX_ a)
 #define sv_2nv_flags(a,b)	Perl_sv_2nv_flags(aTHX_ a,b)
 #define sv_2pv_flags(a,b,c)	Perl_sv_2pv_flags(aTHX_ a,b,c)
-#define sv_2pvbyte(a,b)		Perl_sv_2pvbyte(aTHX_ a,b)
-#define sv_2pvutf8(a,b)		Perl_sv_2pvutf8(aTHX_ a,b)
+#define sv_2pvbyte_flags(a,b,c)	Perl_sv_2pvbyte_flags(aTHX_ a,b,c)
+#define sv_2pvutf8_flags(a,b,c)	Perl_sv_2pvutf8_flags(aTHX_ a,b,c)
 #define sv_2uv_flags(a,b)	Perl_sv_2uv_flags(aTHX_ a,b)
 #define sv_backoff		Perl_sv_backoff
 #define sv_bless(a,b)		Perl_sv_bless(aTHX_ a,b)
diff --git a/ext/XS-APItest/APItest.xs b/ext/XS-APItest/APItest.xs
index 132372c752..d1ca8f94b3 100644
--- a/ext/XS-APItest/APItest.xs
+++ b/ext/XS-APItest/APItest.xs
@@ -4219,12 +4219,26 @@ OUTPUT:
     RETVAL
 
 char *
+SvPVbyte_nomg(SV *sv)
+CODE:
+    RETVAL = SvPVbyte_nomg(sv, PL_na);
+OUTPUT:
+    RETVAL
+
+char *
 SvPVutf8(SV *sv)
 CODE:
     RETVAL = SvPVutf8_nolen(sv);
 OUTPUT:
     RETVAL
 
+char *
+SvPVutf8_nomg(SV *sv)
+CODE:
+    RETVAL = SvPVutf8_nomg(sv, PL_na);
+OUTPUT:
+    RETVAL
+
 void
 setup_addissub()
 CODE:
diff --git a/ext/XS-APItest/t/svpv.t b/ext/XS-APItest/t/svpv.t
index 4602891405..4a27d29729 100644
--- a/ext/XS-APItest/t/svpv.t
+++ b/ext/XS-APItest/t/svpv.t
@@ -1,6 +1,6 @@
 #!perl -w
 
-use Test::More tests => 19;
+use Test::More tests => 35;
 
 use XS::APItest;
 
@@ -18,6 +18,32 @@ for my $func ('SvPVbyte', 'SvPVutf8') {
  is ref\$^V, 'REF', "$func(\$ro_ref) does not flatten the ref";
 }
 
+my $data_bin = "\xC4\x8D";
+utf8::downgrade($data_bin);
+tie my $scalar_bin, 'TieScalarCounter', $data_bin;
+do { my $fetch = $scalar_bin };
+is tied($scalar_bin)->{fetch}, 1;
+is tied($scalar_bin)->{store}, 0;
+is SvPVutf8_nomg($scalar_bin), "\xC3\x84\xC2\x8D";
+is tied($scalar_bin)->{fetch}, 1;
+is tied($scalar_bin)->{store}, 0;
+is SvPVbyte_nomg($scalar_bin), "\xC4\x8D";
+is tied($scalar_bin)->{fetch}, 1;
+is tied($scalar_bin)->{store}, 0;
+
+my $data_uni = "\xC4\x8D";
+utf8::upgrade($data_uni);
+tie my $scalar_uni, 'TieScalarCounter', $data_uni;
+do { my $fetch = $scalar_uni };
+is tied($scalar_uni)->{fetch}, 1;
+is tied($scalar_uni)->{store}, 0;
+is SvPVbyte_nomg($scalar_uni), "\xC4\x8D";
+is tied($scalar_uni)->{fetch}, 1;
+is tied($scalar_uni)->{store}, 0;
+is SvPVutf8_nomg($scalar_uni), "\xC3\x84\xC2\x8D";
+is tied($scalar_uni)->{fetch}, 1;
+is tied($scalar_uni)->{store}, 0;
+
 eval 'SvPVbyte(*{chr 256})';
 like $@, qr/^Wide character/, 'SvPVbyte fails on Unicode glob';
 package r { use overload '""' => sub { substr "\x{100}\xff", -1 } }
@@ -29,3 +55,22 @@ sub FETCH { ${ +shift } }
 tie $tyre, main => bless [], r::;
 is SvPVbyte($tyre), "\xff",
   'SvPVbyte on tie returning ref that returns downgradable utf8 string';
+
+package TieScalarCounter;
+
+sub TIESCALAR {
+    my ($class, $value) = @_;
+    return bless { fetch => 0, store => 0, value => $value }, $class;
+}
+
+sub FETCH {
+    my ($self) = @_;
+    $self->{fetch}++;
+    return $self->{value};
+}
+
+sub STORE {
+    my ($self, $value) = @_;
+    $self->{store}++;
+    $self->{value} = $value;
+}
diff --git a/mathoms.c b/mathoms.c
index 6450291317..65bf267943 100644
--- a/mathoms.c
+++ b/mathoms.c
@@ -1769,6 +1769,22 @@ Perl_sv_utf8_downgrade(pTHX_ SV *const sv, const bool fail_ok)
     return sv_utf8_downgrade(sv, fail_ok);
 }
 
+char *
+Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+{
+    PERL_ARGS_ASSERT_SV_2PVUTF8;
+
+    return sv_2pvutf8(sv, lp);
+}
+
+char *
+Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+{
+    PERL_ARGS_ASSERT_SV_2PVBYTE;
+
+    return sv_2pvbyte(sv, lp);
+}
+
 #endif /* NO_MATHOMS */
 
 /*
diff --git a/proto.h b/proto.h
index ac500a581e..0f42160063 100644
--- a/proto.h
+++ b/proto.h
@@ -3232,9 +3232,14 @@ PERL_CALLCONV char*	Perl_sv_2pv_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVBYTE	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
@@ -3242,9 +3247,14 @@ PERL_CALLCONV char*	Perl_sv_2pvbyte_nolen(pTHX_ SV* sv)
 	assert(sv)
 #endif
 
+#ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp);
 #define PERL_ARGS_ASSERT_SV_2PVUTF8	\
 	assert(sv)
+#endif
+PERL_CALLCONV char*	Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags);
+#define PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS	\
+	assert(sv)
 #ifndef NO_MATHOMS
 PERL_CALLCONV char*	Perl_sv_2pvutf8_nolen(pTHX_ SV* sv)
 			__attribute__warn_unused_result__;
diff --git a/sv.c b/sv.c
index f6665bc030..ba7a566aeb 100644
--- a/sv.c
+++ b/sv.c
@@ -3322,18 +3322,19 @@ Usually accessed via the C<SvPVbyte> macro.
 */
 
 char *
-Perl_sv_2pvbyte(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvbyte_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVBYTE;
+    PERL_ARGS_ASSERT_SV_2PVBYTE_FLAGS;
 
-    SvGETMAGIC(sv);
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
      || isGV_with_GP(sv) || SvROK(sv)) {
 	SV *sv2 = sv_newmortal();
 	sv_copypv_nomg(sv2,sv);
 	sv = sv2;
     }
-    sv_utf8_downgrade(sv,0);
+    sv_utf8_downgrade_nomg(sv,0);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
 
@@ -3349,15 +3350,18 @@ Usually accessed via the C<SvPVutf8> macro.
 */
 
 char *
-Perl_sv_2pvutf8(pTHX_ SV *sv, STRLEN *const lp)
+Perl_sv_2pvutf8_flags(pTHX_ SV *sv, STRLEN *const lp, const U32 flags)
 {
-    PERL_ARGS_ASSERT_SV_2PVUTF8;
+    PERL_ARGS_ASSERT_SV_2PVUTF8_FLAGS;
 
+    if (SvGMAGICAL(sv) && (flags & SV_GMAGIC))
+        mg_get(sv);
     if (((SvREADONLY(sv) || SvFAKE(sv)) && !SvIsCOW(sv))
-     || isGV_with_GP(sv) || SvROK(sv))
-	sv = sv_mortalcopy(sv);
-    else
-        SvGETMAGIC(sv);
+     || isGV_with_GP(sv) || SvROK(sv)) {
+        SV *sv2 = sv_newmortal();
+        sv_copypv_nomg(sv2,sv);
+        sv = sv2;
+    }
     sv_utf8_upgrade_nomg(sv);
     return lp ? SvPV_nomg(sv,*lp) : SvPV_nomg_nolen(sv);
 }
diff --git a/sv.h b/sv.h
index 53aea18aeb..1f24f773a0 100644
--- a/sv.h
+++ b/sv.h
@@ -1622,6 +1622,9 @@ Like C<SvPV_force>, but converts C<sv> to UTF-8 first if necessary.
 =for apidoc Am|char*|SvPVutf8|SV* sv|STRLEN len
 Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.
 
+=for apidoc Am|char*|SvPVutf8_nomg|SV* sv|STRLEN len
+Like C<SvPVutf8>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVutf8_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to UTF-8 first if necessary.
 
@@ -1631,6 +1634,9 @@ Like C<SvPV_force>, but converts C<sv> to byte representation first if necessary
 =for apidoc Am|char*|SvPVbyte|SV* sv|STRLEN len
 Like C<SvPV>, but converts C<sv> to byte representation first if necessary.
 
+=for apidoc Am|char*|SvPVbyte_nomg|SV* sv|STRLEN len
+Like C<SvPVbyte>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVbyte_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to byte representation first if necessary.
 
@@ -1752,6 +1758,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1766,6 +1776,10 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_nomg(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
@@ -1957,7 +1971,9 @@ Like C<sv_catsv> but doesn't process magic.
 #define sv_copypv_nomg(dsv, ssv) sv_copypv_flags(dsv, ssv, 0)
 #define sv_2pv(sv, lp) sv_2pv_flags(sv, lp, SV_GMAGIC)
 #define sv_2pv_nolen(sv) sv_2pv(sv, 0)
+#define sv_2pvbyte(sv, lp) sv_2pvbyte_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvbyte_nolen(sv) sv_2pvbyte(sv, 0)
+#define sv_2pvutf8(sv, lp) sv_2pvutf8_flags(sv, lp, SV_GMAGIC)
 #define sv_2pvutf8_nolen(sv) sv_2pvutf8(sv, 0)
 #define sv_2pv_nomg(sv, lp) sv_2pv_flags(sv, lp, 0)
 #define sv_pvn_force(sv, lp) sv_pvn_force_flags(sv, lp, SV_GMAGIC)
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jul 22, 2019

From @pali

v7-0005-Implement-SvPV-_or_null.patch
From 7d2308925b36601e2817a2f6f5ee77f405a95e5f Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sat, 10 Feb 2018 15:10:04 +0100
Subject: [PATCH 5/5] Implement SvPV*_or_null*

---
 sv.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/sv.h b/sv.h
index 1f24f773a0..56a7cc50dc 100644
--- a/sv.h
+++ b/sv.h
@@ -1625,6 +1625,12 @@ Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.
 =for apidoc Am|char*|SvPVutf8_nomg|SV* sv|STRLEN len
 Like C<SvPVutf8>, but does not process get magic.
 
+=for apidoc Am|char*|SvPVutf8_or_null|SV* sv|STRLEN len
+Like C<SvPVutf8>, but when C<sv> is undef then returns C<NULL>.
+
+=for apidoc Am|char*|SvPVutf8_or_null_nomg|SV* sv|STRLEN len
+Like C<SvPVutf8_or_null>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVutf8_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to UTF-8 first if necessary.
 
@@ -1637,6 +1643,12 @@ Like C<SvPV>, but converts C<sv> to byte representation first if necessary.
 =for apidoc Am|char*|SvPVbyte_nomg|SV* sv|STRLEN len
 Like C<SvPVbyte>, but does not process get magic.
 
+=for apidoc Am|char*|SvPVbyte_or_null|SV* sv|STRLEN len
+Like C<SvPVbyte>, but when C<sv> is undef then returns C<NULL>.
+
+=for apidoc Am|char*|SvPVbyte_or_null_nomg|SV* sv|STRLEN len
+Like C<SvPVbyte_or_null>, but does not process get magic.
+
 =for apidoc Am|char*|SvPVbyte_nolen|SV* sv
 Like C<SvPV_nolen>, but converts C<sv> to byte representation first if necessary.
 
@@ -1758,10 +1770,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8(sv, &lp))
 
+#define SvPVutf8_or_null(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_nomg(sv, lp) \
     (SvPOK_utf8_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvutf8_flags(sv, &lp, 0))
 
+#define SvPVutf8_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvutf8_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVutf8_force(sv, lp) \
     (SvPOK_utf8_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvutf8n_force(sv, &lp))
@@ -1776,10 +1798,20 @@ Like C<sv_catsv> but doesn't process magic.
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte(sv, &lp))
 
+#define SvPVbyte_or_null(sv, lp) \
+    (SvPOK_byte_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : (SvGETMAGIC(sv), SvOK(sv)) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_nomg(sv, lp) \
     (SvPOK_byte_nog(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_2pvbyte_flags(sv, &lp, 0))
 
+#define SvPVbyte_or_null_nomg(sv, lp) \
+    (SvPOK_utf8_nog(sv) \
+     ? ((lp = SvCUR(sv)), SvPVX(sv)) : SvOK(sv) \
+     ? sv_2pvbyte_flags(sv, &lp, 0) : ((lp = 0), NULL))
+
 #define SvPVbyte_force(sv, lp) \
     (SvPOK_byte_pure_nogthink(sv) \
      ? ((lp = SvCUR(sv)), SvPVX(sv)) : sv_pvbyten_force(sv, &lp))
-- 
2.11.0

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Aug 6, 2019

From @pali

On Monday 22 July 2019 15​:53​:30 pali@​cpan.org wrote​:

On Tuesday 21 May 2019 10​:17​:47 Tony Cook wrote​:

The only two problems​:

1) The TODO tests need to be disarmed.

op/bop.t (Wstat​: 0 Tests​: 514 Failed​: 0)
TODO passed​: 141, 143, 146, 148
Files=2656, Tests=1220001, 815 wallclock secs (318.43 usr 18.79 sys + 2117.74 cusr 78.50 csys = 2533.46 CPU)
Result​: PASS

You've added the TODO tests after the fixes, you could either remove
the TODO sets entirely, or move the test commit before the doop.c
changes and disarm the TODO sets in the doop.c commit (or a later
commit.)

I removed TODO marks in this test.

2) The other problem is there's still no API tests, as I said in
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=132782#txn-1619990

There's no tests for the new APIs (which can go in ext/XS-APItest/)

I added XS-API tests for SvPVutf8_nomg and SvPVbyte_nomg.

Please look at updated V7 patches.

Hi Tony! It is now OK? Or is something more needed for these patches?

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Aug 30, 2019

From @pali

On Tuesday 06 August 2019 11​:03​:47 pali@​cpan.org wrote​:

On Monday 22 July 2019 15​:53​:30 pali@​cpan.org wrote​:

On Tuesday 21 May 2019 10​:17​:47 Tony Cook wrote​:

The only two problems​:

1) The TODO tests need to be disarmed.

op/bop.t (Wstat​: 0 Tests​: 514 Failed​: 0)
TODO passed​: 141, 143, 146, 148
Files=2656, Tests=1220001, 815 wallclock secs (318.43 usr 18.79 sys + 2117.74 cusr 78.50 csys = 2533.46 CPU)
Result​: PASS

You've added the TODO tests after the fixes, you could either remove
the TODO sets entirely, or move the test commit before the doop.c
changes and disarm the TODO sets in the doop.c commit (or a later
commit.)

I removed TODO marks in this test.

2) The other problem is there's still no API tests, as I said in
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=132782#txn-1619990

There's no tests for the new APIs (which can go in ext/XS-APItest/)

I added XS-API tests for SvPVutf8_nomg and SvPVbyte_nomg.

Please look at updated V7 patches.

Hi Tony! It is now OK? Or is something more needed for these patches?

Tony​: ping

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 2, 2019

From @tonycoz

On Fri, 30 Aug 2019 05​:00​:52 -0700, pali@​cpan.org wrote​:

On Tuesday 06 August 2019 11​:03​:47 pali@​cpan.org wrote​:

On Monday 22 July 2019 15​:53​:30 pali@​cpan.org wrote​:

On Tuesday 21 May 2019 10​:17​:47 Tony Cook wrote​:

The only two problems​:

1) The TODO tests need to be disarmed.

op/bop.t
(Wstat​: 0 Tests​: 514 Failed​: 0)
TODO passed​: 141, 143, 146, 148
Files=2656, Tests=1220001, 815 wallclock secs (318.43 usr 18.79
sys + 2117.74 cusr 78.50 csys = 2533.46 CPU)
Result​: PASS

You've added the TODO tests after the fixes, you could either
remove
the TODO sets entirely, or move the test commit before the doop.c
changes and disarm the TODO sets in the doop.c commit (or a later
commit.)

I removed TODO marks in this test.

2) The other problem is there's still no API tests, as I said in
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=132782#txn-1619990

There's no tests for the new APIs (which can go in ext/XS-
APItest/)

I added XS-API tests for SvPVutf8_nomg and SvPVbyte_nomg.

Please look at updated V7 patches.

Hi Tony! It is now OK? Or is something more needed for these patches?

Tony​: ping

Thanks, applied as 423ce62 through 3e13e8f.

Sorry about the lack of follow-up, I forgot to bookmark this ticket and lost track of it otherwise.

Tony

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 2, 2019

@tonycoz - Status changed from 'open' to 'pending release'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant