Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize fold_ty #107627

Merged
merged 5 commits into from Feb 6, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
129 changes: 61 additions & 68 deletions compiler/rustc_infer/src/infer/freshen.rs
Expand Up @@ -140,79 +140,21 @@ impl<'a, 'tcx> TypeFolder<'tcx> for TypeFreshener<'a, 'tcx> {
}
}

#[inline]
fn fold_ty(&mut self, t: Ty<'tcx>) -> Ty<'tcx> {
if !t.needs_infer() && !t.has_erasable_regions() {
return t;
}

let tcx = self.infcx.tcx;

match *t.kind() {
ty::Infer(ty::TyVar(v)) => {
let opt_ty = self.infcx.inner.borrow_mut().type_variables().probe(v).known();
self.freshen_ty(opt_ty, ty::TyVar(v), ty::FreshTy)
}
t
} else {
match *t.kind() {
ty::Infer(v) => self.fold_infer_ty(v).unwrap_or(t),

ty::Infer(ty::IntVar(v)) => self.freshen_ty(
self.infcx
.inner
.borrow_mut()
.int_unification_table()
.probe_value(v)
.map(|v| v.to_type(tcx)),
ty::IntVar(v),
ty::FreshIntTy,
),
// This code is hot enough that a non-debug assertion here makes a noticeable
// difference on benchmarks like `wg-grammar`.
#[cfg(debug_assertions)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally think this should stay an "always" assertion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. To avoid any performance issues, this could call a #[cold] function with the bug! inside instead of having the formatting inside the main function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried changing it back to an always assertion, and it had a noticeable perf impact, e.g. the instruction count for wg-grammar increased by 0.7%. I then tried the #[cold] function and it made a small improvement, but was still 0.5% worse.

So I will leave this as is, but I will add a comment about it.

ty::Placeholder(..) | ty::Bound(..) => bug!("unexpected type {:?}", t),

ty::Infer(ty::FloatVar(v)) => self.freshen_ty(
self.infcx
.inner
.borrow_mut()
.float_unification_table()
.probe_value(v)
.map(|v| v.to_type(tcx)),
ty::FloatVar(v),
ty::FreshFloatTy,
),

ty::Infer(ty::FreshTy(ct) | ty::FreshIntTy(ct) | ty::FreshFloatTy(ct)) => {
if ct >= self.ty_freshen_count {
bug!(
"Encountered a freshend type with id {} \
but our counter is only at {}",
ct,
self.ty_freshen_count
);
}
t
_ => t.super_fold_with(self),
}

ty::Generator(..)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional question: Does the wildcard have a perf difference over the exhaustive match? Otherwise, I kinda prefer the exhaustive match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably does. Transforming exhaustive matches to a wildcard in the code generation may be a good idea if so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on the use case. If the match can be converted to a table lookup, the exhaustive match will have one less branch in LLVM, but have a bigger lookup table: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c7b21a7f9d032aea5aa261953a85d735

For actual branching logic, it doesn't really matter. There may be a larger lookup table in LLVM IR, but that will become the same thing at the assembly level

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the debug assertion for Placeholder and Bound in place, doing an exhaustive match is awkward, so I've left this unchanged as well. If it helps, this leaves this method not dissimilar to ShallowResolver::fold_ty, which has the form if let ty::Infer(v) = ty.kind() { ... } else { ty }.

Copy link
Member

@pnkfelix pnkfelix Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(It makes me sad that we discover the need to do micro-optimizations like re-encoding a big or-pattern arm as a wildcard; I, like @compiler-errors, find value in the exhaustive match from the view point of maintenance. Are we keeping track of efforts, if any, to put such a transformation into rustc itself?)

Copy link
Member

@pnkfelix pnkfelix Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I could even imagine an #[rustc_*] attribute that would tell the compiler to convert a given arm into a wild-card at the end of the match. That would provide a way to make @compiler-errors errors and also ease experiments like this one that @nnethercote is doing, right?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is not "wildcard is faster than manually listing the alternatives". The problem is the assertion on Placeholder and Bound. A debug assertion is faster, which makes sense. And once you have the debug assertion for those variants, having a wildcard is a lot easier.

If that assertion wasn't necessary, then you can do an exhaustive match that is the same speed as a wildcard match. (I just tried it out; same speed.) Though I would argue that an exhaustive match probably isn't appropriate when ty::Infer gets treatment A and every other variant gets treatment B.

| ty::Bool
| ty::Char
| ty::Int(..)
| ty::Uint(..)
| ty::Float(..)
| ty::Adt(..)
| ty::Str
| ty::Error(_)
| ty::Array(..)
| ty::Slice(..)
| ty::RawPtr(..)
| ty::Ref(..)
| ty::FnDef(..)
| ty::FnPtr(_)
| ty::Dynamic(..)
| ty::Never
| ty::Tuple(..)
| ty::Alias(..)
| ty::Foreign(..)
| ty::Param(..)
| ty::Closure(..)
| ty::GeneratorWitnessMIR(..)
| ty::GeneratorWitness(..) => t.super_fold_with(self),

ty::Placeholder(..) | ty::Bound(..) => bug!("unexpected type {:?}", t),
}
}

Expand Down Expand Up @@ -253,3 +195,54 @@ impl<'a, 'tcx> TypeFolder<'tcx> for TypeFreshener<'a, 'tcx> {
}
}
}

impl<'a, 'tcx> TypeFreshener<'a, 'tcx> {
// This is separate from `fold_ty` to keep that method small and inlinable.
#[inline(never)]
fn fold_infer_ty(&mut self, v: ty::InferTy) -> Option<Ty<'tcx>> {
nnethercote marked this conversation as resolved.
Show resolved Hide resolved
match v {
ty::TyVar(v) => {
let opt_ty = self.infcx.inner.borrow_mut().type_variables().probe(v).known();
Some(self.freshen_ty(opt_ty, ty::TyVar(v), ty::FreshTy))
}

ty::IntVar(v) => Some(
self.freshen_ty(
self.infcx
.inner
.borrow_mut()
.int_unification_table()
.probe_value(v)
.map(|v| v.to_type(self.infcx.tcx)),
ty::IntVar(v),
ty::FreshIntTy,
),
),

ty::FloatVar(v) => Some(
self.freshen_ty(
self.infcx
.inner
.borrow_mut()
.float_unification_table()
.probe_value(v)
.map(|v| v.to_type(self.infcx.tcx)),
ty::FloatVar(v),
ty::FreshFloatTy,
),
),

ty::FreshTy(ct) | ty::FreshIntTy(ct) | ty::FreshFloatTy(ct) => {
if ct >= self.ty_freshen_count {
bug!(
"Encountered a freshend type with id {} \
but our counter is only at {}",
ct,
self.ty_freshen_count
);
}
None
}
}
}
}
61 changes: 35 additions & 26 deletions compiler/rustc_infer/src/infer/mod.rs
Expand Up @@ -30,7 +30,7 @@ use rustc_middle::ty::relate::RelateResult;
use rustc_middle::ty::subst::{GenericArg, GenericArgKind, InternalSubsts, SubstsRef};
use rustc_middle::ty::visit::TypeVisitable;
pub use rustc_middle::ty::IntVarValue;
use rustc_middle::ty::{self, GenericParamDefKind, InferConst, Ty, TyCtxt};
use rustc_middle::ty::{self, GenericParamDefKind, InferConst, InferTy, Ty, TyCtxt};
use rustc_middle::ty::{ConstVid, FloatVid, IntVid, TyVid};
use rustc_span::symbol::Symbol;
use rustc_span::Span;
Expand Down Expand Up @@ -1389,8 +1389,8 @@ impl<'tcx> InferCtxt<'tcx> {
where
T: TypeFoldable<'tcx>,
{
if !value.needs_infer() {
return value; // Avoid duplicated subst-folding.
if !value.has_non_region_infer() {
nnethercote marked this conversation as resolved.
Show resolved Hide resolved
return value;
}
let mut r = resolve::OpportunisticVarResolver::new(self);
value.fold_with(&mut r)
Expand Down Expand Up @@ -1870,9 +1870,33 @@ impl<'a, 'tcx> TypeFolder<'tcx> for ShallowResolver<'a, 'tcx> {
/// If `ty` is a type variable of some kind, resolve it one level
/// (but do not resolve types found in the result). If `typ` is
/// not a type variable, just return it unmodified.
#[inline]
fn fold_ty(&mut self, ty: Ty<'tcx>) -> Ty<'tcx> {
match *ty.kind() {
ty::Infer(ty::TyVar(v)) => {
if let ty::Infer(v) = ty.kind() { self.fold_infer_ty(*v).unwrap_or(ty) } else { ty }
}

fn fold_const(&mut self, ct: ty::Const<'tcx>) -> ty::Const<'tcx> {
if let ty::ConstKind::Infer(InferConst::Var(vid)) = ct.kind() {
self.infcx
.inner
.borrow_mut()
.const_unification_table()
.probe_value(vid)
.val
.known()
.unwrap_or(ct)
} else {
ct
}
}
}

impl<'a, 'tcx> ShallowResolver<'a, 'tcx> {
// This is separate from `fold_ty` to keep that method small and inlinable.
#[inline(never)]
fn fold_infer_ty(&mut self, v: InferTy) -> Option<Ty<'tcx>> {
match v {
ty::TyVar(v) => {
// Not entirely obvious: if `typ` is a type variable,
// it can be resolved to an int/float variable, which
// can then be recursively resolved, hence the
Expand All @@ -1886,41 +1910,26 @@ impl<'a, 'tcx> TypeFolder<'tcx> for ShallowResolver<'a, 'tcx> {
// Note: if these two lines are combined into one we get
// dynamic borrow errors on `self.inner`.
let known = self.infcx.inner.borrow_mut().type_variables().probe(v).known();
known.map_or(ty, |t| self.fold_ty(t))
known.map(|t| self.fold_ty(t))
}

ty::Infer(ty::IntVar(v)) => self
ty::IntVar(v) => self
.infcx
.inner
.borrow_mut()
.int_unification_table()
.probe_value(v)
.map_or(ty, |v| v.to_type(self.infcx.tcx)),
.map(|v| v.to_type(self.infcx.tcx)),

ty::Infer(ty::FloatVar(v)) => self
ty::FloatVar(v) => self
.infcx
.inner
.borrow_mut()
.float_unification_table()
.probe_value(v)
.map_or(ty, |v| v.to_type(self.infcx.tcx)),
.map(|v| v.to_type(self.infcx.tcx)),

_ => ty,
}
}

fn fold_const(&mut self, ct: ty::Const<'tcx>) -> ty::Const<'tcx> {
if let ty::ConstKind::Infer(InferConst::Var(vid)) = ct.kind() {
self.infcx
.inner
.borrow_mut()
.const_unification_table()
.probe_value(vid)
.val
.known()
.unwrap_or(ct)
} else {
ct
ty::FreshTy(_) | ty::FreshIntTy(_) | ty::FreshFloatTy(_) => None,
}
}
}
Expand Down
13 changes: 8 additions & 5 deletions compiler/rustc_infer/src/infer/resolve.rs
Expand Up @@ -16,26 +16,29 @@ use std::ops::ControlFlow;
/// useful for printing messages etc but also required at various
/// points for correctness.
pub struct OpportunisticVarResolver<'a, 'tcx> {
infcx: &'a InferCtxt<'tcx>,
// The shallow resolver is used to resolve inference variables at every
// level of the type.
shallow_resolver: crate::infer::ShallowResolver<'a, 'tcx>,
nnethercote marked this conversation as resolved.
Show resolved Hide resolved
}

impl<'a, 'tcx> OpportunisticVarResolver<'a, 'tcx> {
#[inline]
pub fn new(infcx: &'a InferCtxt<'tcx>) -> Self {
OpportunisticVarResolver { infcx }
OpportunisticVarResolver { shallow_resolver: crate::infer::ShallowResolver { infcx } }
}
}

impl<'a, 'tcx> TypeFolder<'tcx> for OpportunisticVarResolver<'a, 'tcx> {
fn tcx<'b>(&'b self) -> TyCtxt<'tcx> {
self.infcx.tcx
TypeFolder::tcx(&self.shallow_resolver)
}

#[inline]
fn fold_ty(&mut self, t: Ty<'tcx>) -> Ty<'tcx> {
if !t.has_non_region_infer() {
t // micro-optimize -- if there is nothing in this type that this fold affects...
} else {
let t = self.infcx.shallow_resolve(t);
let t = self.shallow_resolver.fold_ty(t);
t.super_fold_with(self)
}
}
Expand All @@ -44,7 +47,7 @@ impl<'a, 'tcx> TypeFolder<'tcx> for OpportunisticVarResolver<'a, 'tcx> {
if !ct.has_non_region_infer() {
ct // micro-optimize -- if there is nothing in this const that this fold affects...
} else {
let ct = self.infcx.shallow_resolve(ct);
let ct = self.shallow_resolver.fold_const(ct);
ct.super_fold_with(self)
}
}
Expand Down