Commits on Sep 18, 2013
  1. android: Remove builtin_compiler

    groleo committed with chadversary Sep 13, 2013
    The first part was done in:
       commit c845140
       Author: Kenneth Graunke <>
       Date:   Tue Sep 3 21:22:17 2013 -0700
    Signed-off-by: Adrian Negreanu <>
    Acked-by: Ian Romanick <>
    Reviewed-by: Chad Versace <>
  2. util/u_blit: Implement util_blit_pixels via pipe_context::blit.

    jrfonseca committed Sep 12, 2013
    This removes a lot of code, but not everything, as util_blit_pixels_tex
    is still useful when one needs to override pipe_sampler_view::swizzle_?.
    Reviewed-by: Zack Rusin <>
    Reviewed-by: Marek Olšák <>
    Reviewed-by: Roland Scheidegger <>
  3. util/u_blit: Support blits from cubemaps.

    jrfonseca committed Sep 17, 2013
    By calling util_map_texcoords2d_onto_cubemap.
    A new parameter for util_blit_pixels_tex is necessary, as
    pipe_sampler_view::first_layer is always supposed to point to the first
    face when sampling from cubemaps.
    Reviewed-by: Zack Rusin <>
    Reviewed-by: Marek Olšák <>
    Reviewed-by: Roland Scheidegger <>
  4. vega: Use pipe_context::blit instead of util_blit_pixels_tex.

    jrfonseca committed Sep 17, 2013
    Only compile-tested but it seems straightforward.
    Reviewed-by: Zack Rusin <>
    Reviewed-by: Marek Olšák <>
    Reviewed-by: Roland Scheidegger <>
  5. i965: Rename brw_{fs,vec4}_emit.cpp to brw_{fs,vec4}_generator.cpp.

    kaydenl committed Sep 18, 2013
    The previous names were really confusing to talk about:
    - brw_fs_visitor() contained methods named emit_whatever().
    - brw_fs_generator() contained methods named generate_whatever(), but
      lived in brw_fs_emit.cpp.
    So when someone said "the emit layer", or "emit code", we weren't sure
    whether they meant the visitor's emit() functions or the generator in
    By renaming these files, the method names, class names, and file names
    all match, which is much less confusing.
    Signed-off-by: Kenneth Graunke <>
    Acked-by: Paul Berry <>
    Acked-by: Eric Anholt <>
  6. glsl: Correctly validate fma()'s types.

    mattst88 committed Sep 6, 2013
    lrp() can take a scalar as a third argument, and fma() cannot.
    Reviewed-by: Kenneth Graunke <>
  7. glsl: Add frexp signatures and implementation.

    mattst88 committed Sep 9, 2013
    I initially implemented frexp() as an IR opcode with a lowering pass,
    but since it returns a value and has an out-parameter, it would break
    assumptions our optimization passes make about ir_expressions being pure
    (i.e., having no side effects).
    For example, if opt_tree_grafting encounters this code:
    uniform float u;
    void main()
    {
      int exp;
      float f = frexp(u, out exp);
      float g = float(exp)/256.0;
      float h = float(exp) + 1.0;
      gl_FragColor = vec4(f, g, h, g + h);
    }
    it may try to optimize it to this:
    uniform float u;
    void main()
    {
      int exp;
      float g = float(exp)/256.0;
      float h = float(exp) + 1.0;
      gl_FragColor = vec4(frexp(u, out exp), g, h, g + h);
    }
    Some hardware has an instruction which performs frexp(), but we would
    need some other compiler infrastructure to be able to generate it, such
    as an intrinsics system that would allow backends to emit specific code
    for particular bits of IR.
    Reviewed-by: Paul Berry <>
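    For reference, frexp() splits a float x into a significand in
    [0.5, 1) and an integer exponent such that x == sig * 2^exp.  The
    following is a minimal C sketch of that computation using bit
    manipulation; it is not the code from this commit.  The name
    sketch_frexp is hypothetical, and the sketch assumes IEEE-754
    single precision and ignores denormals, infinities, and NaNs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split a normal (or zero) float into a significand
 * in [0.5, 1) and a power-of-two exponent, so that x == sig * 2^exp.
 * Denormals, infinities, and NaNs are deliberately not handled. */
static float sketch_frexp(float x, int *exp_out)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    if ((bits & 0x7fffffffu) == 0) {    /* +0.0 or -0.0 */
        *exp_out = 0;
        return x;
    }

    /* The biased exponent lives in bits 23..30.  Subtracting 126 (not
     * 127) accounts for the significand being in [0.5, 1) rather than
     * [1, 2). */
    *exp_out = (int)((bits >> 23) & 0xff) - 126;

    /* Keep sign and mantissa, force the exponent field to 126, which
     * is the exponent of 0.5. */
    bits = (bits & 0x807fffffu) | (126u << 23);

    float sig;
    memcpy(&sig, &bits, sizeof sig);
    return sig;
}
```

    The exponent comes back through a pointer, mirroring the
    out-parameter that makes frexp() awkward to model as a pure
    expression.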
Commits on Sep 17, 2013
  1. i965: Lower ldexp.

    mattst88 committed Aug 3, 2013
    v2: Drop frexp lowering.
    Reviewed-by: Paul Berry <>
  2. glsl: Add ldexp_to_arith lowering pass.

    mattst88 committed Aug 3, 2013
    Reviewed-by: Paul Berry <>
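    One way to express ldexp(x, e) with plain arithmetic is to build
    the float 2^e directly from its bit pattern and multiply.  The C
    sketch below illustrates that idea only; it is an assumption and
    not necessarily what the ldexp_to_arith pass emits.  The name
    sketch_ldexp is hypothetical, and the sketch is only valid for
    -126 <= e <= 127 with IEEE-754 single precision.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical lowering of ldexp(x, e) to integer and float arithmetic.
 * Only valid for -126 <= e <= 127, where 2^e is a normal float. */
static float sketch_ldexp(float x, int e)
{
    /* Build 2^e directly: biased exponent (e + 127) in bits 23..30,
     * sign and mantissa bits left at zero. */
    uint32_t bits = (uint32_t)(e + 127) << 23;

    float scale;
    memcpy(&scale, &bits, sizeof scale);
    return x * scale;
}
```

    A real lowering pass also has to decide what happens when the
    result overflows or underflows; this sketch leaves that out.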
  3. glsl: Allow vectors to be created from ir_constant().

    mattst88 committed Aug 5, 2013
    Note the parameter name change in the int version of ir_constant, to
    avoid the conflict with the loop iterator.
    v2: Make analogous change to builtin_builder::imm().
    Reviewed-by: Paul Berry <>
  4. glsl: Add support for ldexp.

    mattst88 committed Aug 22, 2013
    v2: Drop frexp. Rebase on builtins rewrite.
    Reviewed-by: Paul Berry <>
  5. i965: Add some missing bits to {mesa,brw,cache}_bits[].

    stereotype441 committed Sep 2, 2013
    These data structures are used for debug output, so it wasn't hurting
    anything that there were missing bits.  But it's good to keep things
    up to date.
    This patch also adds static asserts so that the {brw,cache}_bits[]
    arrays are the proper size, so that we don't forget to add to them in
    the future.  Unfortunately there's no convenient way to assert that
    mesa_bits[] is the proper size.
    Reviewed-by: Kenneth Graunke <>
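    The static-assert idea can be sketched in C as follows.  All names
    here (sketch_state_id, sketch_bits, and so on) are hypothetical,
    and the zero terminator entry is an assumption about the table
    layout; the point is only that the array length is tied to the
    enum at compile time.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical state bits; the last enumerator counts the entries. */
enum sketch_state_id {
    SKETCH_STATE_URB,
    SKETCH_STATE_WM,
    SKETCH_STATE_VS,
    SKETCH_NUM_STATE_BITS
};

struct sketch_dirty_bit {
    unsigned bit;
    const char *name;
};

/* One named entry per state bit, followed by a zero terminator. */
static const struct sketch_dirty_bit sketch_bits[] = {
    { 1u << SKETCH_STATE_URB, "URB" },
    { 1u << SKETCH_STATE_WM,  "WM"  },
    { 1u << SKETCH_STATE_VS,  "VS"  },
    { 0, NULL }
};

/* Compilation fails if a state bit is added without a table entry. */
_Static_assert(ARRAY_SIZE(sketch_bits) == SKETCH_NUM_STATE_BITS + 1,
               "sketch_bits[] must have one entry per state bit");

/* Look up the debug name for a single dirty bit. */
static const char *sketch_bit_name(unsigned bit)
{
    for (size_t i = 0; sketch_bits[i].name != NULL; i++) {
        if (sketch_bits[i].bit == bit)
            return sketch_bits[i].name;
    }
    return "unknown";
}
```

    This only works when the bits come from an enum with a countable
    last value, which is consistent with the note that mesa_bits[]
    cannot conveniently be checked the same way.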
  6. i965/gs: Implement basic gl_PrimitiveIDIn functionality.

    stereotype441 committed Aug 12, 2013
    If the geometry shader refers to the built-in variable
    gl_PrimitiveIDIn, we need to set a bit in 3DSTATE_GS to tell the
    hardware to dispatch primitive ID to r1, and we need to leave room for
    it when allocating registers.
    Note: this feature doesn't yet work properly when software primitive
    restart is in use (the primitive ID counter will incorrectly reset
    with each primitive restart, since software primitive restart works by
    performing multiple draw calls).  I plan to address that in a future
    patch series.
    Fixes piglit test "spec/glsl-1.50/execution/geometry/primitive-id-in".
    Reviewed-by: Kenneth Graunke <>
  7. i965/gs: New gs primitive types are supported by HW primitive restart.

    stereotype441 committed Aug 27, 2013
    When we previously implemented primitive restart, we didn't add cases
    to brw_primitive_restart.c's can_cut_index_handle_prims() for the
    primitive types that are introduced with geometry shaders.  It turns
    out that all of the new primitive types are supported by hardware
    primitive restart.
    Reviewed-by: Kenneth Graunke <>
  8. i965/gs: Add new primitive types.

    stereotype441 committed Apr 28, 2013
    As part of its support for geometry shaders, GL 3.2 introduces four
    Reviewed-by: Kenneth Graunke <>
  9. gallivm: some bits of seamless cube filtering implementation

    Roland Scheidegger committed Sep 13, 2013
    Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a
    correct implementation for nearest filtering, and it's way better than
    using repeat wrap for instance for linear filtering (though obviously this
    doesn't actually do seamless filtering).
    v2: fix s/t wrap not r/s...
    Reviewed-by: Brian Paul <>
    Reviewed-by: Jose Fonseca <>
  10. i965: Remove MIPLAYOUT_BELOW from Gen4-6 constant buffer surface state.

    kaydenl committed Sep 14, 2013
    Specifying a miptree layout makes no sense for constant buffers.
    This has no functional change since BRW_SURFACE_MIPMAPLAYOUT_BELOW is
    just a #define for 0.
    Signed-off-by: Kenneth Graunke <>
    Reviewed-by: Paul Berry <>
  11. egl: Also add EGL_TEXTURE_FORMAT as a valid eglQueryWaylandBufferWL a…

    krh committed Sep 17, 2013
    Now that we have a table of accepted eglQueryWaylandBufferWL() attributes,
    we should also list EGL_TEXTURE_FORMAT.
  12. egl: add EGL_WAYLAND_Y_INVERTED_WL attribute

    Stanislav Vorobiov committed with krh Sep 16, 2013
    This enables querying of wl_buffer's orientation
  13. i965: Use gen7_upload_constant_state for 3DSTATE_CONSTANT_PS as well.

    kaydenl committed Sep 13, 2013
    Now we use gen7_upload_constant_state() for all three shader stages.
    Signed-off-by: Kenneth Graunke <>
    Reviewed-by: Paul Berry <>
  14. i965: Set brw_stage_state::push_const_size for PS constants.

    kaydenl committed Sep 13, 2013
    This paves the way for using gen7_upload_constant_state for PS data.
    The formula is copied from gen7_wm_state.c.
    Signed-off-by: Kenneth Graunke <>
    Reviewed-by: Paul Berry <>
  15. i965: Introduce a prog_data temporary in gen6_upload_wm_push_constants.

    kaydenl committed Sep 13, 2013
    This saves a bit of typing and shortens a few lines.
    Signed-off-by: Kenneth Graunke <>
    Reviewed-by: Paul Berry <>
Commits on Sep 16, 2013
  1. i965/gen6+: Support 128 varying components.

    stereotype441 committed Sep 3, 2013
    GL 3.2 requires us to support 128 varying components for geometry
    shader outputs and fragment shader inputs, and 64 varying components
    otherwise.  But there's no hardware limitation that restricts us to 64
    varying components, and core Mesa doesn't currently allow different
    stages to have different maximum values, so just go ahead and enable
    128 varying components for all stages.  This gets us better test
    coverage anyway.
    Even though we are only working on GL 3.2 support for gen7 right now,
    gen6 also supports 128 varying components, so go ahead and switch it
    on there too.
    Reviewed-by: Kenneth Graunke <>
  2. i965/ff_gs: Generate URB writes using a loop.

    stereotype441 committed Sep 3, 2013
    Previously we only ever did 1 URB write, since the maximum number of
    varyings we support is small enough to fit in 1 URB write (when using
    BRW_URB_SWIZZLE_NONE, which is what the pre-Gen7 GS always uses).  But
    we're about to increase the number of varying components we support
    from 64 to 128.
    With 128 varyings, the most URB writes we'll have to do is 2, but it's
    just as easy to write a general-purpose loop.
    Reviewed-by: Kenneth Graunke <>
  3. i965/gen6: Fix assertions on VS/GS URB size.

    stereotype441 committed Sep 3, 2013
    The "{VS,GS} URB Entry Allocation Size" fields of 3DSTATE_URB allow
    values in the range 0-4, but they are U8-1 fields, so the range of
    possible allocation sizes is 1-5.  We were erroneously prohibiting a
    size of 5.
    Reviewed-by: Kenneth Graunke <>
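    A "U8-1" field stores the value minus one, so a field range of 0-4
    encodes sizes 1-5.  A minimal C sketch of the encoding and the
    corrected bound follows; the function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: encode an allocation size of 1..5 into a
 * "minus one" hardware field whose legal values are 0..4. */
static unsigned sketch_encode_urb_entry_size(unsigned size)
{
    /* The upper bound is 5, not 4: the field holds size - 1. */
    assert(size >= 1 && size <= 5);
    return size - 1;
}

/* Inverse mapping: recover the allocation size from the field value. */
static unsigned sketch_decode_urb_entry_size(unsigned field)
{
    assert(field <= 4);
    return field + 1;
}
```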
  4. i965/vec4: Generate URB writes using a loop.

    stereotype441 committed Sep 3, 2013
    Previously we only ever did 1 or 2 URB writes, since the maximum
    number of varyings we support is small enough to fit in 2 URB writes.
    But GL 3.2 requires the geometry shader to support 128 output varying
    components, and this could require up to 3 URB writes.
    Reviewed-by: Kenneth Graunke <>
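    The general shape of such a loop can be sketched in C: split the
    total number of output slots into chunks no larger than what one
    write can carry.  The struct, the function name, and the
    max_per_write parameter are all hypothetical; the real per-write
    limit depends on the hardware message and is not stated here.

```c
/* Hypothetical description of one URB write: where it starts and how
 * many slots it covers. */
struct sketch_urb_write {
    unsigned offset;
    unsigned length;
};

/* Split total_slots into writes of at most max_per_write slots each
 * (max_per_write must be nonzero).  Returns the number of writes
 * recorded in out[], never more than out_cap. */
static unsigned sketch_plan_urb_writes(unsigned total_slots,
                                       unsigned max_per_write,
                                       struct sketch_urb_write *out,
                                       unsigned out_cap)
{
    unsigned count = 0;
    unsigned slot = 0;

    while (slot < total_slots && count < out_cap) {
        unsigned length = total_slots - slot;
        if (length > max_per_write)
            length = max_per_write;

        out[count].offset = slot;
        out[count].length = length;
        count++;
        slot += length;
    }
    return count;
}
```

    With a loop like this, going from two writes to three needs no
    special casing; only the slot count changes.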
  5. i965/fs: When >64 input components, order them to match prev pipeline…

    stereotype441 committed Sep 3, 2013
    … stage.
    Since the SF/SBE stage is only capable of performing arbitrary
    reorderings of 16 varying slots, we can't arrange the fragment shader
    inputs in an arbitrary order if there are more than 16 input varying
    slots in use.  We need to make sure that slots 16-31 match the
    corresponding outputs of the previous pipeline stage.
    The easiest way to accomplish this is to just make all varying slots
    match up with the previous pipeline stage.
    Reviewed-by: Kenneth Graunke <>
  6. i965/fs: Simplify computation of key.input_slots_valid during precomp…

    stereotype441 committed Sep 3, 2013
    The for loop was rather silly.  In addition to checking brw->gen < 6
    on each loop iteration, it took pains to exclude bits from
    fp->Base.InputsRead that don't correspond to fragment shader inputs.
    But those bits would never have been set in the first place, since the
    only bits that are ever set in fp->Base.InputsRead are fragment shader
    inputs.
    Reviewed-by: Kenneth Graunke <>
  7. i965/gs: Stop storing an input VUE map in the GS program key.

    stereotype441 committed Sep 2, 2013
    Now that the vertex shader output VUE map is determined solely by a
    64-bit bitfield, we don't have to store it in its entirety in the
    geometry shader program key; instead, we can just store the bitfield,
    and let the geometry shader infer the VUE map at compile time.
    This dramatically reduces the size of the geometry shader program key,
    which we want to keep small since it gets recomputed whenever the
    active program changes.
    Reviewed-by: Kenneth Graunke <>
  8. i965/gen6+: Remove VUE map dependency on userclip_active.

    stereotype441 committed Sep 2, 2013
    Previously, on Gen6+, we laid out the vertex (or geometry) shader VUE
    map differently depending on whether user clipping was active.  If it was
    active, we put the clip distances in slots 2 and 3 (where the clipper
    expects them); if it was inactive, we assigned them in the order of
    the gl_varying_slot enum.
    This made for unnecessary recompiles, since turning clipping on/off
    for a shader that used gl_ClipDistance might rearrange the varyings.
    It also required extra bookkeeping, since it required the user
    clipping flag to be provided to brw_compute_vue_map() as a parameter.
    With this patch, we always put clip distances in slots 2 and 3 if
    they are written to.  do_vs_prog() and do_gs_prog() are responsible
    for ensuring that clip distances are written to when user clipping is
    enabled (as do_vs_prog() previously did for gen4-5).
    This makes the only input to brw_compute_vue_map() a bitfield of which
    varyings the shader writes to, a fact that we'll take advantage of in
    forthcoming patches.
    Reviewed-by: Kenneth Graunke <>
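    A simplified C sketch of a layout function whose only input is the
    bitfield of written varyings is given below.  It is not
    brw_compute_vue_map(): the enum, the struct, and the assumption
    that slot 0 is a header and slot 1 is the position are all
    hypothetical simplifications.  What it shows is that clip
    distances land in slots 2 and 3 whenever they are written, with no
    separate user-clipping flag.

```c
#include <stdint.h>

/* Hypothetical, much shortened list of varyings. */
enum sketch_varying {
    SKETCH_VARYING_POS,
    SKETCH_VARYING_COLOR,
    SKETCH_VARYING_CLIP_DIST0,
    SKETCH_VARYING_CLIP_DIST1,
    SKETCH_VARYING_TEX0,
    SKETCH_VARYING_COUNT
};

#define SKETCH_BIT(v) (UINT64_C(1) << (v))

struct sketch_vue_map {
    int varying_to_slot[SKETCH_VARYING_COUNT];  /* -1 if not present */
    int num_slots;
};

/* Lay out the map from nothing but the bitfield of written varyings. */
static void sketch_compute_vue_map(struct sketch_vue_map *map,
                                   uint64_t slots_valid)
{
    for (int v = 0; v < SKETCH_VARYING_COUNT; v++)
        map->varying_to_slot[v] = -1;

    /* Assumed fixed prefix: slot 0 is a header, slot 1 the position. */
    map->varying_to_slot[SKETCH_VARYING_POS] = 1;
    int next = 2;

    /* Clip distances go in slots 2 and 3 if either one is written. */
    if (slots_valid & (SKETCH_BIT(SKETCH_VARYING_CLIP_DIST0) |
                       SKETCH_BIT(SKETCH_VARYING_CLIP_DIST1))) {
        map->varying_to_slot[SKETCH_VARYING_CLIP_DIST0] = next++;
        map->varying_to_slot[SKETCH_VARYING_CLIP_DIST1] = next++;
    }

    /* Everything else follows in enum order. */
    for (int v = 0; v < SKETCH_VARYING_COUNT; v++) {
        if ((slots_valid & SKETCH_BIT(v)) && map->varying_to_slot[v] < 0)
            map->varying_to_slot[v] = next++;
    }
    map->num_slots = next;
}
```

    Because the result depends only on slots_valid, a consumer that
    knows the bitfield can rebuild the same map without being handed
    the map itself.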
  9. i965/fs: Stop wasting input attribute space on gl_FragCoord and gl_Fr…

    stereotype441 committed Sep 3, 2013
    Previously, if a fragment shader accessed gl_FragCoord or
    gl_FrontFacing, we would assign them their own slots in the fragment
    shader input attribute array, using up space that could be made
    available to real varyings.  This was not strictly necessary (since
    these values are not true varyings, and are instead computed from
    other data available in the FS payload).  But we had to do it anyway
    because the SF/SBE setup code assumed that every 1 bit in the
    gl_program::InputsRead bitfield corresponded to a genuine varying
    Now that the SF/SBE code consults brw_wm_prog_data and only sets up
    the attributes that the fragment shader actually needs, we don't have
    to do this anymore.
    Reviewed-by: Kenneth Graunke <>
  10. i965/sf: Consult brw_wm_prog_data when setting up SF/SBE state.

    stereotype441 committed Sep 3, 2013
    Previously, the SF/SBE setup code delivered varying inputs to the FS
    in the order in which they appear in the gl_program::InputsRead
    bitfield, since that's what the FS expects.
    When we add support for more than 64 varying components, this will no
    longer always be the case, because the Gen6+ SF/SBE stage is only
    capable of performing arbitrary reorderings of 16 varying slots.  So,
    when there are more than 16 vec4's worth of varying inputs, the FS
    will have to adjust the order of its input varyings in order to partially
    match the order of outputs from the geometry or vertex shader.
    To allow extra flexibility in the ordering of FS varyings, this patch
    causes the SF/SBE to deliver varying inputs to the FS in exactly the
    order that the FS requests, by consulting brw_wm_prog_data::urb_setup
    and brw_wm_prog_data::num_varying_inputs.
    Reviewed-by: Kenneth Graunke <>
  11. i965/sf: Consolidate common code for setting up gen6-7 attribute over…

    stereotype441 committed Sep 3, 2013
    Reviewed-by: Kenneth Graunke <>
  12. i965/sf: Use BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoded values.

    stereotype441 committed Sep 2, 2013
    We always program the SF unit to start reading the vertex URB entry at
    offset 1.  In upcoming patches, we'll be adding FS code that relies on
    this.  So consistently use the constant BRW_SF_URB_ENTRY_READ_OFFSET
    rather than hardcoding a 1.
    Reviewed-by: Kenneth Graunke <>
  13. i965/fs: Consult brw_wm_prog_data::num_varying_inputs when setting up…

    stereotype441 committed Sep 3, 2013
    … WM state.
    Previously, we assumed that the number of varying inputs consumed by
    the fragment shader was equal to the number of bits set in
    gl_program::InputsRead.  However, we'll soon be making two changes
    that will cause that not to be true:
    - We'll stop wasting varying input space for gl_FragCoord and
      gl_FrontFacing, which aren't varyings.
    - For fragment shaders that have more than 16 varying inputs, we'll
      adjust the layout of the inputs to account for the fact that the
      SF/SBE pipeline stage can't reorder inputs beyond the first 16; if
      there are GS outputs that the FS doesn't use (or vice versa) this
      may cause the number of FS varying inputs to change.
    So, instead of trying to guess the number of FS inputs from
    gl_program::InputsRead, simply read it from
    brw_wm_prog_data::num_varying_inputs, which is guaranteed to be correct
    since it's populated by fs_visitor::calculate_urb_setup().
    Reviewed-by: Kenneth Graunke <>
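    The first of the two cases can be illustrated with a small C
    sketch: once gl_FragCoord and gl_FrontFacing no longer occupy
    slots, counting the set bits of the inputs-read mask overestimates
    the number of varying inputs.  The bit positions and function
    names below are hypothetical, and the sketch does not model the
    second case (layout changes with more than 16 inputs).

```c
#include <stdint.h>

/* Hypothetical bit positions for the two inputs that are not true
 * varyings. */
#define SKETCH_BIT_FRAG_COORD   (UINT64_C(1) << 0)
#define SKETCH_BIT_FRONT_FACING (UINT64_C(1) << 1)

/* Count the set bits in a 64-bit mask. */
static unsigned sketch_popcount64(uint64_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Count only the inputs that need a varying slot. */
static unsigned sketch_count_varying_inputs(uint64_t inputs_read)
{
    uint64_t non_varying = SKETCH_BIT_FRAG_COORD | SKETCH_BIT_FRONT_FACING;
    return sketch_popcount64(inputs_read & ~non_varying);
}
```

    Reading a count that the compiler recorded avoids having to
    re-derive it from the mask in every piece of state setup code.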